perm filename NOTES.JRA[LSP,JRA]17 blob
sn#192399 filedate 1975-12-16 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00054 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00006 00002 .SEC(Evaluation of LISP expressions,Evaluation,,P113:)
C00030 00003 .SS(S-expr translation of LISP expressions)
C00041 00004 .SS(Symbol tables,symbol tables,P92:)
C00049 00005
C00052 00006 .SS(%3λ%*-notation,%3λ%*-notation,P49:)
C00067 00007 .SS(Mechanization of evaluation,,P6:)
C00081 00008 .SS(Examples of %3eval%*,examples of %3eval%*)
C00086 00009 It should now be clear that %3eval%* and friends %2do%* begin to
C00091 00010 .<<the | | symbol table >>
C00100 00011 .SS(Free variables,free variables,P135:)
C00106 00012 .SS(Environments and bindings,,P77:)
C00115 00013 .<<RED'S FACT>>
C00119 00014 Notice in this example we will get the correct binding of %3x%* locally.
C00123 00015 .CENT(Problems)
C00125 00016 .SS(%3label%*,%3label%*,P38:)
C00132 00017 .SS(Functional Arguments and Functional Values,functional argument,P76:)
C00166 00018 .<<pic of funarg execution>>
C00179 00019 .SS(Binding strategies,binding strategy,P134:)
C00182 00020 .SS(special forms,special form,P8:)
C00189 00021 .SS(The %3prog%*-feature,,P37:)
C00203 00022 Now that assignment statements are out in the open let's reexamine "<=".
C00206 00023 .CENT(Alternatives to %3prog%*)
C00207 00024 .SS(Rapprochement: In retrospect,,P90:)
C00224 00025 .CENT(More postmortem)
C00234 00026 .SEC(Running on the machine,,P175:)
C00240 00027 .SS(%3evalquote%*,%3evalquote%*)
C00248 00028 .SS(A project: extensions to %3eval%*,%3eval%*,P67:)
C00257 00029 .SS(A project: Syntax-directed processes,syntax-directed)
C00258 00030 .SEC(Implementation of the Static structure of LISP,static structure)
C00264 00031 .SS(Representation of Symbolic expressions)
C00278 00032 .SS(Representation of LISP primitives,,P26:)
C00291 00033 .SS(Symbol tables: revisited,,P128:)
C00306 00034 .<<atom struct for fact>>
C00326 00035 .<<pic of NIL>>
C00331 00036 .SS(A first look at %3cons%1)
C00337 00037 .SS(Storage management: garbage collection,garbage collection,P146:)
C00360 00038 .CENT(A simple LISP garbage collector written in LISP)
C00369 00039 .CENT(Problems)
C00372 00040 .SS(Input/Output: %3read%* and %3print%*,,P13:)
C00384 00041 .SS(Symbol table searching: Hashing,hashing,P14:)
C00392 00042 .GROUP
C00395 00043 .SS(A review of the structure of the LISP machine,LISP machine)
C00402 00044 .SS(Binding revisited,bind,P78:)
C00408 00045 .SS(%aSM%*-%3eval%*,%3eval%*,P30:)
C00428 00046 .CENT(An example of shallow binding and %3FUNARG%*)
C00433 00047 Compare the shallow binding strategy to that of the a-list. The a-list was always
C00440 00048 .SS(Macros and special forms,macros,P18:)
C00457 00049 .CENT(Problems)
C00459 00050 .SS(An example of %aSM%3-eval%1)
C00465 00051 .SS(Progs in %aSM%3-eval%1)
C00471 00052 .SS(An alternative: deep bindings,deep binding,P148:)
C00485 00053 .SS(Epilogue)
C00489 00054 .END "JRA"
C00490 ENDMK
C⊗;
.SEC(Evaluation of LISP expressions,Evaluation,,P113:)
.BEGIN indent 20,20;
"... I always worked with programming languages because it seemed to me
that until you could understand those, you really couldn't understand
computers. Understanding them doesn't really mean only being able to use
them. A lot of people can use them without understanding them. ..."
.END
.BEGIN TURN ON "→";
→Christopher Strachey, %3The Word Games of the Night Bird.%*
.END
.BEGIN "JRA" TURN ON "←","#";
.SS(Introduction)
%1
In the previous sections of this text we have talked
about some of the schemes for evaluation. We have done so rather
informally for LISP; we have been more precise about evaluation
of simple arithmetic expressions. {yonss(P2)} discusses this in
some detail.
We shall now look more closely at the informal
process which we have been using in the ⊗→evaluation↔← of LISP expressions.
This is motivated by at least two desires.
First, we want to run our LISP programs on a machine.
To do so requires the implementation of some kind of
translators to turn LISP programs into instructions which
can be carried out by a conventional machine.
We will be interested in the structure of such implementations.
Any implementation of LISP must be grounded on a precise,
and clear understanding of what LISP-evaluation
entails. Indeed, a deep understanding
of evaluation is a prerequisite for implementation of %2any%* language.
The question of evaluation cannot be sidestepped by basing a language
on a compiler. A compiler must produce code which when executed, simulates
the evaluation process. There is no escape.
Our second reason for pursuing evaluation involves the question
of programming language specification. This title covers a multitude
of sins.
At a practical level we want a clean, machine independent, "self-evident"
description of a language specification, so that the agony involved
in implementing the design can be minimized. At a more abstract level,
we should try to understand just what %2is%* specified when we design a
language. Are we specifying a single machine, a class of machines, a
class of mathematical functions; just what is a language?
The syntactic specification of languages is reasonably well established,
but syntax is only the tip of the iceberg. Our study of LISP
will address itself to the deeper problems of semantics, or meaning,
of languages.
Before we address the direct question of LISP evaluation, we
should perhaps wonder aloud about the efficacy of studying languages in the
detail which we are proposing.
First, as computer scientists we should be curious about the
structure of programming languages because we
must understand our tools
--#our programming#languages. People who simply wish to
%2use%* computers as tools need not
care about the structure of languages. Indeed they usually
couldn't care less about the inner workings of the language; they only
want languages
in which they can state their problems in a reasonably
natural manner.
They want their programs to run and get results. They are interested in
the output and seldom are interested in the detailed process of
computation.
For a simple analogy, consider the field of mathematics.
The practicing mathematician uses his tools --#proofs#-- in a similar
manner to the person interested in computer applications.
He seldom needs to examine questions like "what is a
proof?" He does not analyze his tools.
However it must be pointed out that not so many years ago
such questions indeed were raised, and for good reason.
Some common forms of reasoning were shown to lead to
contradictions unless care was taken.
Our position is more like that of the foundations of mathematics;
there the tools of mathematics %2are%* studied and held up to the
light. Mathematics has flourished because of it. Though our expectations
are not quite that presumptuous, we %2do%* expect that
programming language design cannot help but be improved.
Our study of language implementation will proceed from the abstract to
the concrete. Each level will intimately involve the study of data structures.
The current chapter will be the most abstract, building a precise
high-level description of an evaluation scheme for LISP. In fact,
the discussion is much more general than that of LISP; the text addresses
itself to problem areas in the design of any reasonably sophisticated
language. In subsequent chapters we probe beneath the surface of this
high-level description and discuss common ways of implementing
the necessary data structures.
But now, how can be begin to understand LISP evaluation?
In {yonss(P2)} we made a beginning, giving an algorithm for
a subset of the computations expressible in LISP.
This subset covered evaluation of some simple arithemetic expressions.
From our earliest grade school days we had to evaluate simple
arithmetic expressions. Later, in algebra we managed to cope with
expressions involving function application. Most of us survived the
experience. We should now try to understand the processes we used in these
simple arithmetic cases, doing our examination at the most mechanical
level.
Those parts which are not mechanical⊗↓Are machine-independent, if you
wish.←, should be remembered. We will make reasonable choices so that
the process %2does%* become mechanical, and proceed. Later, we should
reflect on what effect our choices had on the resulting scheme ⊗↓implementation←.
The first thing to note in examining simple arithmetic
examples is that %2nothing%* is really said about the process
of evaluation.
When asked to evaluate
%3(2*3)#+#(5*6)%* we never specified which summand was to be evaluated first.
Indeed it didn't matter here. %36#+#(5*6)%* or %3(2*3)#+#30%* both
end up %336%*.
Does it %2ever%* matter? "+" and "*" are examples of arithmetic functions;
can we always leave
the order of evaluation unspecified for arithmetic functions?
How about evaluation of arbitrary functional expressions?
If the order doesn't matter, then the specification of the evaluation
process becomes much simpler. If it %2does%* matter then we must
know why and where.
.P98:
We have seen that the order of evaluation %2can%* make a difference in LISP.
On {yon(P106)} we saw that LISP's computational
interpretation of function evaluation requires some care.
On {yon(P17)} we saw that order of
evaluation in conditional expressions can make a difference.
We must make %2some%* decision regarding
the order of evaluation of the arguments to a function call,
say %3f[t%41%*;t%42%*; ...t%4n%1]. We will assume that we will evaluate the arguments
from left to right.
Or consider the example due to J. Morris:
.P84:
.BEGIN CENTERIT;
←%3f[x;y] <= [x = 0 → 0; %et%* → f[x-1;f[y-2;x]]]]%1.
.END
Evaluation of %3f[2;1]%*
will terminate if we always evaluate the leftmost-outermost occurrence of %3f%*.
Thus:
←%3f[2;1] = f[1;f[-1;2]] = f[0;f[f[-1;2]-2];-1] = 0;%*
However if we evaluate the rightmost-innermost occurrences first, the
computation will not terminate:
←%3f[2;1] = f[1;f[-1;2]] = f[1;f[-2;f[0;-1]]] = f[1;f[-2;0]] = ... .%*
So the choice of evaluation schemes is, indeed, fraught with peril.
The evaluation scheme which we choose
is called %2⊗→call-by-value↔←%* or %2inside-out%* evaluation,
and is the %aCBV%*-scheme of
{yon(P121)}. Alternative proposals exist
and have their merits; ⊗→call-by-name↔← (outside-in) evaluation is another
well-known scheme. We advertised this on {yon(P126)} as %aCBN%*.
From a computational
perspective, we can live with call-by-value. Though we know the use of such a rule
may lead to non-terminating computations when call-by-name would run to
completion, the efficient implementation available for call-by-value
usually outweighs any theoretical advantage possessed by the other leading
product.
Intuitively, call-by-value says:
evaluate the arguments to a function before you attempt to apply the arguments
to the function definition.
Let's look at a simple arithmetic example.
Let %3f[x;y]%* be %3x%82%*#+#y%1 and consider %3f[3+4;2*2]%*. Then
call-by-value says evaluate the arguments, getting %37%* and %34%*;
associate those values
with the formal parameters of %3f%* (i.e. %37%* with %3x%* and
%34%* with %3y%*) and then evaluate the body of
%3f%* resulting in #%37%82%*#+#4#=#57%1. This is the scheme we
captured in {yonss(P2)}.
Call-by-name says pass the %2unevaluated%* actual
parameters to the function, giving %3(3+4)%82%*#+#2*2%1
which also evaluates to %357%*.
Call-by-name is essentially a "substitution followed by simplification"
system of computation. Implementations of this scheme involve
unacceptable inefficiencies.
We will say more about call-by-name and other styles of
evaluation later. Most of this section will be restricted to call-by-value.
If you look at the structure of %3value%9''%1 and %3apply%9'%1 beginning
on {yon(P125)} you will see that they encode the %aCBV%* philosophy,
are recursive and have an intended interpretation which
goes something like this:
%21.%* If the expression is a constant then the value
of the expression is that constant. (the value of %33%* is %33%*
⊗↓We are ignoring the distinction between the %2numeral%* %33%*
and the %2number%3 3.%1←).
%22.%* If the expression is a variable then see what the current value associated
with that variable is. Within the evaluation of say
%3f[3;4]%* where %3f[x;y]#<=#x%82%*#+#y %1the current value of the variable
%3x%* is %33%*.
%23.%* The only other kind of arithmetic expression that we can have is a function
name followed by arguments, for example, %3f[3;4]%*.
In this case we first evaluate the arguments ⊗↓Here we are using the evaluation
process recursively.←;
and then apply the definition of the function to those
evaluated arguments. How do we apply the function definition
to the evaluated arguments?
We associate or ⊗→bind↔← the formal parameters (or variables)
of the definition to the values of the actual parameters.
We then evaluate the body of the function in this new environment.
Notice that we do %2not%* explicitly substitute the values for the variables
which appear in an expression. We simulate substitutions by table lookup.
.P80:
A moments reflection on the informal evaluation process
we use in LISP should convince us of the plausibility of
describing LISP evaluation at a similar level of precision.
First, if the LISP expression is a constant, then the value of the
expression is that constant.
What are the constants of LISP? They're just the S-exprs.
Thus
the value of %3(A#.#B)%* is %3(A#.#B)%*; just like the value of %33%* is %33%*.
The additional artifact of LISP which we have to include in a discussion of
evaluation is the conditional expression. But clearly its evaluation can also
be precisely specified. We did so on {yon(P40)}.
So, in more specific detail, here is some of the structure of the
LISP evaluation mechanism:
%21.%* If the expression to be evaluated is a constant then the value
is that constant.
%22.%* If the expression is a variable find its value in the current environment.
%23.%* If the expression is a conditional expression then it is of the form
%3[p%41%* → e%41%*; p%42%* → e%42%*; ... p%4n%* → e%4n%1]. Evaluate it
using the semantics defined on {yon(P40)}.
%24.%* If the expression is of the form: function name followed by
arguments then:
.BEGIN INDENT 10,10,10;GROUP;
%2a.%1 Evaluate the arguments (from left to right).
%2b.%1 Find the definition of the function.
%2c.%1 Associate the evaluated arguments with the formal parameters
in the function definition.
%2d.%1 Evaluate the body of the function, while remembering the values
of the variables.
.END
So we %2can%* describe the outline of a mechanical procedure which
is used in the evaluation of LISP functions.
Now for the final step.
We have seen ({yonss(P2)}) that we can transcribe
a simple kind of arithmetic evaluation into a recursive LISP program.
That program operates on a representation of the expression and
produces the value. Most of our work in that example was done
without giving explicit details of the representation.
However when we discussed the representation of simple differentiation
in {yonss(P41)} we showed a detailed representation.
We have demonstrated an informal, but reasonably precise, evaluation scheme
for LISP;
our discussion is ready for a more formal development.
It should be clear that we could write a LISP
function representing the evaluation process provided that we can
find a representation for LISP expressions as S-expressions.
This mapping, %eR%*, of LISP expressions to S-exprs is our first order of business. We
will accomplish this mapping by using an extension of the scheme introduced in
{yonss(p41)}.
We plan to expend some effort in describing a specific representation
for LISP expressions for two reasons. First, though abstraction
is a most desirable attribute, we must reconcile our abstraction
with reality; our programs must run. The second point is that
the representation we pick will have a very close relationship to the
way we present programs to the machine. So we will be careful and
thorough in describing the mapping and you should be conscientious
in your understanding of that mapping.
Once that representation is given we will produce a LISP
algorithm which describes the
evaluation process used in LISP.
This process of mapping LISP expressions onto S-exprs, and writing a LISP
function to act as an evaluator may seem a bit incestuous;
indeed, the rationale for doing any of this may not be apparent.
Patience please.
First, the mapping is no more obscure than the polynomial evaluation or
differentiation examples.
It is just another instance of the diagram of {yon(P46)},
only now we are applying the process to LISP itself. The effect is to force
us to make precise exactly what is meant by LISP evaluation. This
precision will have many important ramifications.
Second, we've been doing evaluation of S-expr representations of
LISP expressions already.
The %2great mother of all functions%* is exactly the evaluation mechanism for
the LISP primitive functions and predicates, %3car, cdr, cons, atom%* and %3eq%*
when restricted to functional composition and constant arguments.
The %2great mother revisited%* is the extension to conditional expressions.
In the next section we will give a typical mapping of LISP expressions
to elements of %d<sexpr>%*. However remember that we should attempt
to keep the knowledge of the representation out of the structure
of the algorithm.
But let's stop for some examples of translating LISP functions into S-expr notation.
.SS(S-expr translation of LISP expressions)
.BEGIN
We will go through the list of LISP constructs, describing the
effect of the representational map, %eR%*, and give a few examples
applying %eR%*.
.GROUP
%1we will represent numerals just as numerals, e.g.:
.BEGIN CENTERIT;
←%eR%f( %1<numeral> %f)%3 = %1<numeral>
←%eR%f( %32 %f)%3 = 2
.END
%1we will translate identifiers to their upper-case counterpart. Thus:
.BEGIN CENTERIT
←%eR%f( %1<identifier> %f)%3 = %1<literal atom>
←%eR%f( %3x %f)%3 = %3X
←%eR%f( %3y2 %f)%3 = %3Y2
←%eR%f( %3car %f)%3 = %3CAR
.END
.APART
.BEGIN FILL;
%1Now we've got a problem:
We wish to map arbitrary LISP expressions to S-expressions. The LISP expression
%3x%* translates to %3X%*; %3X%* is itself a LISP expression (a constant);
%2it%* must also have a translate.
We must be a little careful here.
When we write Son of Great Mother we will give it an S-expr representation
of a form to be evaluated. We might give it the representation of %3car[x]%*
in which case the value computed will depend on the current value bound to %3x%*.
We might also give the representation of %3car[X]%*; in this case we should
expect to be presented with an error message.
Or, for example some function %3foo%* we wish to write may return the S-expr
representation of a LISP form as its value.
Say %3foo[1]%* returns the representation of %3car[x]%* and %3foo[2]%* returns
the representation of %3car[X]%*. We must be able to distinguish between these
representations.
That is, given the representation,
there should be exactly one way of interpreting it as a LISP expression.
The mapping must be 1-1. So we must represent %3x%* and %3X%* as %2different%* S-exprs.
The translation scheme we pick is:
for any S-expression, %9α%*, its translation is %3(⊗→%3QUOTE%*↔← %9α%3)%1.
.END
.GROUP
.SELECT 1;
.BEGIN CENTERIT
←%eR%f( %1<sexpr> %f)%3 = %3(QUOTE %1<sexpr>%3)%1
.END
For example:
.BEGIN CENTERIT
←%eR%f( %3X %f)%3 = %3(QUOTE X)%1
←%eR%f( %3(A .B) %f)%3 = %3(QUOTE (A . B))%1
←%eR%f( %3QUOTE %f)%3 = %3(QUOTE QUOTE)%1
.END
.APART
.GROUP
%1
.BEGIN FILL;
We must also show how to map expressions of the form %3f[e%41%*#;#...#;e%4n%*]%1
onto S-exprs.
We have already seen one satisfactory mapping for functions in prefix form
in {yonss(P41)}.
We will use that mapping (called Cambridge Polish) here. That is:
.END
.BEGIN TURN ON "←";
←%eR%f( %3f[e%41%*;e%42%*; ...e%4n%*] %f)%3 =
( %eR%f( %3f %f)%3, %eR%f(%3 e%41%* %f)%3, %eR%f(%3 e%42%* %f)%3 ...%eR%f(%3 e%4n%* %f)%3 )%1
.END
.APART
.GROUP
%1Examples:%3
.BEGIN CENTERIT;
←%eR%f( %3car[x] %f)%3 = (%eR%f( %3car %f)%3, %eR%f( %3x %f)%3 ) = (CAR X)
←%eR%f( %3car[X] %f)%3 = (%eR%f( %3car %f)%3, %eR%f( %3X %f)%3 ) = (CAR (QUOTE X))
←%eR%f( %3cons[cdr[(A .B)];x] %f)%3 = (CONS (CDR (QUOTE (A . B))) X)
.END
.APART
.GROUP
%1The %eR%*-mapping must handle conditional expressions. A conditional is
represented as a list whose first element is %3COND%* and whose next %3n%*
elements are representations of the %3p%4i%*-e%4i%1 pairs. The %eR%*-map
of such pairs is a list of the %eR%*-maps of the two elements:
.BEGIN TURN ON "←";
←%eR%f( %3[p%41%* → e%41%*; ... p%4n%* → e%4n%*] %f)%3 =
(⊗→%3COND%*↔← %3(%eR%f(%3 p%41 %f)%3 %eR%f(%3 e%41%f)%3 ) ... (%eR%f(%3 p%4n%3 %f)%3 %eR%f(%3 e%4n %f)%3))
%1and an example:%3
←%eR%f( %3[atom[x] →1; q[y] → X] %f)%3 = (COND ((ATOM X) 1) ((Q Y) (QUOTE X)))
.END
%1
.APART
.END
.P44:
Notice that %3(COND#(#...#))%* and %3(QUOTE#...#)%* %2look%* like translates of function
calls of the form %3cond[#...#] %* and %3quote[#...#]%*. This is %2not%* the case;
%3QUOTE%* and %3COND%* are not translates of LISP functions.
The constructs %3QUOTE%* and %3COND%* do not use call-by-value evaluation and
are handled in a special way as we shall soon see.
Finally, the translates of the truth values, %et%1 and %ef%1 will be
%3T%* and %3NIL%*, respectively.
.BEGIN CENTERIT;
←%eR%f( %et%* )%3 = T
←%eR%f( %ef%* )%3 = NIL
.END
You might have noticed that these last two applications of the %eR%*-mapping
have the potential to cause trouble. They will spoil the 1-1 property of %eR%*:
.BEGIN CENTERIT;
←%eR%f( %3t%* )%3 = T
←%eR%f( %3nil%* )%3 = NIL
.END
The usual way to escape from this difficulty; is to outlaw %3t%* and
%3nil%* as LISP variables.
Perhaps our concern for the %eR%*-mapping's properties appears to
border on paranoia where a simple solution seems
apparent: %et%* is %et%* and %3t%* is %3t%*; when we
want the truth value we write %et%* and when we want the variable we
write %3t%*. The problem is that when we write programs in a format
which a machine version of LISP will understand, we will be writing
the %eR%*-image, rather than the LISP expression form. Thus to
ask a LISP machine to evaluate %3car[(A#.#B)]%* we present it
with %3(CAR#(QUOTE#(A#.#B)))%*. What this means is that we are presenting
our programs to the machine as data structures of the language.
It would be like expressing programs in Fortran or Algol as
arrays of integers; that is, the data structures of %2those%* languages.
We will explore the implications of this
approach to programming in later sections, but for now
it should help to know that we will be making extensive use
of the %eR%*-mapping.
You should now go back and look at the %3tgm%*'s ({yonss(P39)}) now
that you know that they are evaluators for simple subsets of LISP expressions.
Note that the only atoms which the great
mothers recognize are %3T%* and %3NIL%*. Any other atoms
elicit an error message.
What do these other atoms represent? That is what objects are atoms
the %eR%*-maps of? Well, numerals are atoms and are the %eR%*-maps of
numerals. We certainly could extend %3tgmoaf%* to handle this case.
Atoms are translates of another class of LISP expressions; they
are the translates of variables and function names.
So one task before us is to incorporate a mechanism into our simple LISP
evaluator which will handle evaluation of variables. We've already seen
the necessary mechanism in {yonss(P2)} where we studied tables as an abstract
data stucture. The other piece of LISP which did not appear in the evaluator
for polynomials was conditional expressions. Before handling conditionals
we wish to handle the informal intuitive discussion of tables in {yonss(P2)}
in a more precise manner.
.SS(Symbol tables,symbol tables,P92:)
%1
One of the distinguishing features of computer science is
its reliance on the ability to store and recover information.
Any formalism which addresses itself to computer science
must take this into account. In particular we must be able to
handle the effect of assignment or binding, that is, the association
of a value with a name in an environment. A common device used to accomplish this
is a symbol table. This is the device we used informally in {yonss(P2)}.
In essence, a symbol table is simply a set of ordered pairs
of objects (a function or relation, if you wish). One of the
elements of each pair is a name; the other is its value.
As an abstract operation, finding an element in a symbol table is
quite simple. Given a set of ordered pairs and a variable,
find a pair with first element the same as the given variable.
This level of abstractions is a bit too spartan.
The level of abstraction we wish to envision looks on a
symbol table as a %2sequence%* of pairs, each pair representing a variable
and its corresponding value.
The addition of sequencing seems minor, but its importance will become
apparent.
We will use the list-operations for examining elements of the sequence,
but we will need to designate a selector, %3name%*, to locate
the name-component of a pair; and a selector, %3value%*, to
retrieve the value component.
These simple symbol tables are also known as ⊗→association lists↔←
or %2⊗→a-lists↔←%*.
Thus %3assoc%* will be the name of our function to search a symbol table.
%3assoc%* will take two arguments: a name and a symbol table.
It will examine the table from left to right, looking for the first
pair whose %3name%*-part matches the given name. If such a pair is found,
then that pair is returned. If %2no%* such pair is found, the
result is undefined.
.BEGIN TABS 10,25;TURN ON "\";NOFILL;
.SELECT 3;
\assoc[x;l] <= \[\ eq[name[first[l]];x] → first[l];
\\ %et%* → assoc[x;rest[l]]]
.END
Notice there may be more than one pair with the same
%3name%*-part; only the %2first%* is found.
The ordering inherent in the searching algorithm will be utilized
very soon.
We will come back to symbol tables in {yonss(P128)} to study
the problems of efficient storage and retrieval of information.
It will suffice now simply to think of a symbol table as represented
in LISP by a list of dotted pairs, a name dotted with value.
In this representation, then, %3name[x] <= car[x]%*, and %3value[x]#<=#cdr[x]%*.
For completeness, we should also specify a constructor. Though we
won't need it for a while, it's name is ⊗→%3mkent%*↔←, and
will take a name and a value and return a new entry. Its representation here is
%3mkent[x;y]#<=#cons[x;y]%*.
.GROUP
.P110:
Recall that we are representing variables as atoms; then if %3x, y, %*and %3z%*
had current values %32, 3, %*and %34%*, then a symbol table describing that fact could
be encoded as:
←%3((X . 2)(Y . 3)(Z . 4)) .%1
.APART
.GROUP
.BEGIN CENTERIT;GROUP;
%1For example:
←%3assoc[Y;((X . 2)(Y . 3)(Z . 4))] = (Y . 3)
←assoc[U;((X . 2)(Y . 3)(Z . 4))] %1 is undefined.
.END
.APART
%1
How are we to represent bindings of variables to non numeric S-exprs? For example,
how will we represent: "the current value of %3x%* is %3A%*"?
We will place the dotted-pair %3(X . A)%* in the table. Now this
representation is certainly open to question: why not add
%3(X .(QUOTE A))%*? The latter notation is more consistent
with our conception of representation espoused on {yon(P46)}.
That is, we map LISP expressions to S-expressions; perform the
calculations on this representation, and finally %2reinterpret%*
the result of this calculation as a LISP expression. The representation
we have chosen for symbol tables obviates the last reinterpretation
step. Now it will turn out that for our initial subsets of LISP
this reinterpretation step simply would involve "stripping" the %3QUOTE%*s.
The only "values" which a computation can return are constants; later
things will become more difficult. Perhaps this representation of
table entries is a poor one, we will see.
In studying any existing language, or contemplating the design of
any new one, we must question each detail of representation. Decisions
made too early can have serious consequences.
However, in defense of LISP, it must be remembered that
LISP was conceived at the time that FORTRAN was believed to be a
gift from God. Only later did we learn the true identity of the donor.
Before moving on we should probably take stock of our current
position; in this section
we have simply recreated the table-lookup mechanism we used
in {yonss(P2)}, only now we are paying a bit more attention
to representation.
We can locate things in a table and we have seen how calling
functions can add values to a table. We have said nothing
about adding function definitions to the tables.
Abstractly we know how to extract the definition from the table and
apply it. We must give an explicit representation of the storage of a
function. This turns out to be a reasonably non-trivial problem.
We have seen that it is possible to mechanize at least one
scheme for evaluation of functions -- left-to-right and call-by-value.
We have seen that it is possible to translate LISP expressions
into S-exprs in such a way that we can write a LISP function
which will act as an evaluator for such translates. In the process
we have had to mechanize the intuitive devices we (as humans) might
use to recall the definition of functions and to recall the current
values of variables. It was soon clear that the same mechanism
could be used: that of symbol tables. To associate a variable with
a value was easy. To associate a function name with its definition
required some care. That is, part of the definition of a function
involves the proper association of formal parameters with the
body of the definition. (We actually need to be a bit more
precise than this in describing recursive functions, but this is
good enough for now.) The device we chose is called the %2lambda
notation%*.
.SS(%3λ%*-notation,%3λ%*-notation,P49:)
.BEGIN TURN ON "←";
Recall our discussion of the problems of representation of function
definitions. This discussion began on {yon(P124)} and our conclusion was
that to represent a definition like %3f[x;y]#<=#%9x%*[x;y]%1
we needed a symbol table entry with name %3f%* and a value part which
contained the body of the definition, %9x%3[x;y]%1, and the
list of arguments, %3[x;y]%*, given with %3f%*.
LISP uses the λ-notation to lend precision to our informal
discussion of function representation.
Why add more notation to the LISP pot?
The λ-notation is a device invented by the logician Alonzo Church in
conjunction with his studies in logic and foundations of mathematics.
The λ-calculus is useful for a careful discussion of functions
and is therefore applicable in a purified discussion of procedures in programming
languages. We shall begin a detailed discussion of the
λ-calculus and its relation to computer science in {yonss(P85)}.
The notation was introduced into Computer Science by John McCarthy in
the description of LISP.
In the interest of precision we should note that there are actually several
important distinctions between Church's λ-calculus and the notation envisioned
by McCarthy. We will point out the discrepancies in {yonss(P85)}.
As we intimated previously, the λ-notation as used in LISP, allows
us to attach names to functions in an unambiguous manner.
We have been very imprecise when writing %3f[x;y]#<=#[x*y#+#y]%* as
a definition of the function %3f%*.
First note that %3f[x;y]%* is %2not%* a function, %3f%* is.
To see what %3f[x;y]%* means consider the following example.
When we are asked to evaluate %3car[(A . B)]%* we say the value is %3A%*.
%3car[(A . B)]%* is an expression to be evaluated; it is a LISP ⊗→form↔←.
If %3car[(A . B)]%* %2is%* a form then so is %3car[x]%*;
only now the value of the
form depends on the current value assigned to the variable %3x%*.
So the %2⊗→function↔←%* is %3car%*; the %2form%* is %3car[x]%*.
The function is %3f%*; %3f[x;y]%* is a form, as is %3x*y#+#y%*.
Also our notation has really been specifying more than just the name.
The notation specifies the formal parameters (%3x%* and %3y%*) and
the order in which we are to associate actual parameters in a call
with the formal parameters of the definition (%3x%* with the first, %3y%* with the second).
More subtly, the notation tells %2which%* variables in the function body
are bound variables. That is, which variables are to expect values when the
function is called. For example define %3g[x]#<=#[x*y#+#y]%*; then the
value of %3g[2]%* is well defined and is itself a function.
We wish to have a notation to assign all this other information with the
function body so that function definitions can be inserted into the symbol
table as "values". They will be parametric values, but they will be values.
The λ-notation performs this task by preceding the function body with
a list of the bound variables, called %2⊗→lambda variables↔←%*.
The list of variables is usually referred to as the %2⊗→lambda list↔←%*.
Then this resulting construct is preceeded by "λ[" and followed by "]".
Thus using the above example, the name %3f%* is exactly the
same function as %3λ[[x;y]#x*y#+#y]%*.
We actually need a bit more than
λ-notation to specify recursive functions in a natural manner. See {yonss(P38)}.
The λ-notation introduces nothing new as far as our intuitive binding and
evaluation processes are concerned; it only makes these operations more
precise. So to evaluate a λ-expression, bind the evaluated actual parameters
to the λ-variables and evaluate the function body. One benefit of the
λ-notation is that we need not give explicit names to functions in order to
perform the evaluation. Evaluation of such anonymous functions should be
within the province of LISP.
For example evaluate:
←%3λ[[x;y] x%82%* + y][2;3] %1.
We associate %3x%* with %32%* and %3y%* with %33%* and evaluate the expression:
←%3x%82%* + y%1.
Assuming arithmetic works as usual, this calculation should give %37%* as value.
←Or evaluate: %3λ[[x] cdr[car[x]]][((A . B). C)]%*.
We bind %3x%* to the
S-expression %3((A#.#B).#C)%* and evaluate the function body. The usual
LISP evaluation scheme entails evaluating %3car[x]%* with the current
binding of %3x%*; this result, %3(A#.#B)%*, is passed to %3cdr%*.
%3cdr%* performs its calculation and finally returns %3B%*.
The λ-notation can be used anywhere LISP expects to find a function.
For example:
←%3λ[[x] first[x]][λ[[y] rest[y]][(A B)]]%1
This expression is a complicated way of writing:
←%3f[g[(A B)]]%* where %3f[x] <= first[x] %*and %3g[y] <= rest[y]%*.
Though the second form is perhaps easier for us to comprehend, the
first form %2is%* equivalent and will be acceptable to the evaluator we will
write. Indeed the mechanical evluation of the second formulation will pass
through the first on its way to complete evaluation. At any rate:
←%3λ[[x] first[x]][λ[[y] rest[y]][(A B)]] = λ[[x] first[x]][(B)] = B%1
Still, evaluation requires
the usual care. For example, is %3λ[[x]2]%* the constant function which always
gives value %32%*? No, it isn't; the evaluation of an expression (form) involving
this function requires the evaluation of the actual parameter associated with %3x%*.
That computation may not terminate. For example, consider %3λ[[x]2][fact[-1]]%*.
where %3fact%* is the LISP implementation of the factorial function given on
{yon(P20)}.
Since we intend to include λ-expressions in our language we must include
an %eR%*-mapping of them into S-expression form:
.BEGIN GROUP;centerit;
←%eR%f( %3λ[[x%41%*; ...; x%4n%*]%9x%*] %f)%3 = (LAMBDA (X%41%* ... X%4n%*) %eR%f( %9x%3 %f)%3)%1
.END
Thus the character, λ, will be translated to %3LAMBDA%*. %3LAMBDA%* is therefore a
kind of prefix operator (like %3COND%*) which will help the evaluator recognize
the occurrence of a function definition (just as %3COND%* will be used by the
evaluator to recognize the occurrence of a translate of a conditional
expression).
Here are some examples of %3λ%*-expressions and their %eR%*-translates:
.BEGIN CENTERIT;GROUP;
←%eR%f( %3λ[[x;y] x%82%* + y] %f)%3 = (LAMBDA(X Y)(PLUS (EXPT X 2) Y))
←%eR%f( %3λ[[x;y] cons[car[x];y]] %f)%3 = (LAMBDA(X Y)(CONS (CAR X) Y))
.END
To make our discussion of λ-expressions completely legitimate,
our LISP syntax equations should now be augmented to include:
.BEGIN TABIT1(11);GROUP;
<function>\::= λ[<varlist><form>]
<varlist>\::= [<variable>; ... ; <variable>]
\::= [ ]
.END
Besides giving a precise notation for storing function definitions, and
giving a means of working with anonymous functions, the λ-notation
is a useful computational device
in that
nebulous area called efficiency. Efficiency is like structured programming --
difficult to define but recognizable when it occurs.
Consider the following sketch of a function definition:
←%3g <= λ[[x][%9p%*[lic[x]] → lic[x]; .... x ...]]%1,
where %3lic%* may be a %dl%*ong %di%*nvolved %dc%*alculation.
We certainly must compute %3lic[x]%* %2once%*. But as %3g%* is defined,
we would compute %3lic[x]%* %2twice%* if p%41%* is true: once
in the calculation of p%41%*; once as e%41%*. Since
both calculations of %3lic[x]%* will give the same value⊗↓Our
current LISP subset has no side effects. That means
there is no way for a computation to affect its surrounding
environment. The most common construct which has a side-effect
is the assignment statement.←, this second calculation
is unnecessary. %3g%* is inefficient. We could write:
←%3g <= λ[[x][f[lic[x];x]]%1
where:
←%3f <= λ[[u;v][%9p%*[u] → u; .... v ...]]%1.
.P170:
In this scheme %3lic%* %2will%* only be evaluated once; its value
will be passed into %3f%*.
Using λ-expressions, in a style
called %2⊗→internal lambdas↔←%* we can improve %3g%* without
adding any new function names to our symbol tables as follows:
Replace the body of %3g%* with,
%aLAM← %3λ[y][%9p%*[y] → y; ... x ...][lic[x]]%1.
Call this new function %3g'%*:
.BEGIN TABIT2(10,18);GROUP;
\%3g' <= λ[[x]
\\λ[y][%9p%*[y] → y; ... x ... ][lic[x]]
\\ ] %1.
.END
Now when %3g'%* is called we evaluate the actual parameter, binding it
to %3x%*, as always; and evaluate the body, %aLAM%*. Evaluation of %aLAM%*
involves calculation of %3lic[x]%* %2once%*, binding the result to
%3y%*. We then evaluate the body of the conditional expression as before.
If p%41%* %2is%* true, then this definition of %3g'%* involves
one calculation of %3lic[x]%* and two table look-ups (for the value of %3y%*),
rather than the two calculations of %3lic[x]%* in %3g%*.
More conventional programming languages can obtain the same effect as
this use of internal lambdas by assignment of %3lic[x]%* to a
temporary variable. We will introduce assignment statements in LISP in
{yonss(P37)}.
.END
.CENT(Problems);
I What is the difference between %3λ[[ ] x*y + y]%* and %3x*y + y%*?
****more probs****
.SS(Mechanization of evaluation,,P6:)
%1
We now have picked a representation for LISP expressions; have
introduced a precise notation for discussing functions; we
have given plausibility arguments for the existence of an evaluator for
LISP. It is now time to write a formal, precise, evaluator for LISP expressions.
The evaluator will be the final arbiter on the question of the
meaning of a LISP construct. The evaluator is thus a very important
algorithm. We will express it and its related functions in as
representation-free form as possible, but we will always have
our Cambridge⊗↓Massachusetts, not England.← polish
representation in the back of our minds.
As we have said, %3tgmoaf%* and %3tgmoafr%* ({yonss(P39)}) are evaluators
for subsets of LISP.
Armed with our symbol-table mechanism we could now extend the
%2great-mothers%* to handle variable look-ups.
Rather than do this we will display our burgeoning cleverness
and make a total revision of the structure of the
evaluators. In making the revision, the
following points should be remembered:
1. Expressions (to be evaluated) can contain variables, both simple
variables and variables naming λ-expressions. Thus evaluation must be done
with respect to an environment or symbol table. We wish to recognize other
(translates of) function names besides %3CAR, CDR, CONS, EQ%* and%3 ATOM%*
in our evaluator, but
explicitly adding new definitions to the evaluator in the style of the
recognizers for the five primitives is a bad idea. Our symbol table should
hold the function definitions and the evaluator should contain the general
schemes for finding the definitions, binding variables to values, and
evaluating the function body. By now you should be convinced that this
process is a reasonably well defined task and could be written in LISP.
2. All %2function%* calls are to be evaluated by-value. However there are
some %2⊗→special forms↔←%* we have seen which are not evaluated in the
normal manner. In particular, conditional expressions, quoted expressions,
and lambda expressions are handled differently.
The primary function is named ⊗→%3eval%*↔← rather than %3sogm%* (Son of Great Mother).
It will take two arguments; the first
is a representation of an expression to be evaluated, and the second
is a representation of a symbol table. %3eval%* will recognize numbers, the
constants %3T%* and %3NIL%*, and if presented with a variable, will attempt to
find the value of the variable in the symbol table using %3assoc%* ({yon(P92)}).
%3eval%* also handles the special forms %3COND%* and %3QUOTE%*. If %3eval%* sees
a ⊗→conditional expression↔← (represented by %3(COND ...)%* ) then the body
of the ⊗→%3COND%*↔← is passed to a subfunction named ⊗→%3evcond%*↔←.
%3evcond%* encodes the conditional expression semantics as described on {yon(P40)}. The special
form, ⊗→%3QUOTE%*↔←, signifies the occurrence of a constant. Simply return
that constant.
As far as this %3eval%* is concerned, any other expression is a function-call.
The argument-list evaluation is handled by %3evlis%* in the authorized
left-to-right ordering. This is handled by recursion on the sequence of arguments.
In this case we %2apply%* the function
to the list of evaluated arguments.
This is done by the function %3apply%*.
With this short introduction we will now write a more general evaluator
which will handle a larger subset of LISP than the %3tgm%*s.
Here's the new %3eval%*:
.BEGIN SELECT 3;GROUP;TABIT1(7);
eval <= λ[[exp;environ]
\[isvar[exp] → value[assoc[exp;environ]];
\ isconst[exp] → denote[exp];
\ iscond[exp] → evcond[exp;environ];
\ isfunc+args[exp] → apply[func[exp];evlis[arglist[exp];environ];environ]]
.APART
.GROUP
%1and:%*
denote <= λ[[exp]
\[isnumber[exp] → exp; %1(see footnote {yon(P80)})%3
\ istruth[exp] → exp;
\ isfalse[exp] → exp;
\ isSexpr[exp] → rep[exp]
\]]
.GROUP
%1where:%*
evcond <= λ[[e;environ][eval[ante[first[e]];environ] → eval[conseq[first[e]];environ];
\ %et%* → evcond[rest[e];environ]]]
.APART
.GROUP
%1and,%*
evlis <= λ[[e;environ][null[e] → NIL;
\ %et%* → concat[eval[first[e];environ];evlis[rest[e];environ]]]]
.END
The selectors, constructors and recognizers which relate this abstract
definition to our particular S-expression representation are grouped with
%3apply%* on {yon(P108)}.
The subfunctions, %3evcond%* and %3evlis%*, are simple. %3evcond %*you've seen
before in %3tgmoafr%* in a less abstract form;
%3evlis%* simply manufactures a new list consisting of the
results of evaluating (from left to right) the elements of %3e%*, using the
symbol table, %3environ%*, where necessary.
The left-to-right sequencing should be apparent both in %3evlis%* and %3evcond%*.
A more subtle application of the sequence property occurs
within %3apply%* in the symbol table
search and construction process. We know that %3assoc%* looks from left to right
for the latest binding of a variable. Thus the function which %2augments%* the
table must add the latest binding to the %2front%*. When do new bindings
occur? They occur at λ-binding time, and at that time the function %3pairlis%* will
build an augmented symbol table with the λ-variables bound to their evalauted
arguments.
⊗→%3apply%*↔← takes three arguments: a function, a list of evaluated arguments,
and a symbol table. %3apply%* explicitly recognizes the five primitive functions %3CAR,
CDR, CONS, EQ, %*and%3 ATOM%*. If the function name is a variable,
the
definition is located in the symbol table by %3eval%* and applied to the arguments.
Otherwise the function must be a λ-expression. This is where things
get interesting. We know what we must do: evaluate the body of the
λ-expression after binding the formal parameters of the λ-expression
to the evaluated arguments. How do we ⊗→bind↔←? We add variable-value
pairs to the symbol table. We will define a subfunction, %3pairlis%*, to
perform the binding. Then all that is left to do is give the function
body and the new symbol table to %3eval%*. Here is %3apply%*:
.BEGIN SELECT 3;GROUP;TABIT1(15);
.P69:
apply <= λ[[fn;args,environ]
\[iscar[fn] → car[arg%41%*[args]];
\ iscons[fn] → cons[arg%41%*[args];arg%42%*[args]];
\ .... ....
\ isvar[fn] → apply[eval[fn;environ];args;environ];
\ islambda[fn] → eval[body[fn];pairlis[vars[fn];args;environ]]
\ ]]
pairlis <= λ[[vars;vals;environ][null[vars] → environ;
\%et%* → concat[mkent[first[vars];first[vals]];pairlis[rest[vars];rest[vals];environ]]]]
.END
Some of the functions and predicates which will relate these abstact definitions to our
specific S-expression representation of LISP constructs are:
.BEGIN TABIT3(5,33,53);GROUP;SELECT 2;
.P108:
\Recognizer\Selector\Constructor
.SELECT 3;
iscar <= λ[[x]eq[x;CAR]]\func <= λ[[x]first[x]]\mkent <= λ[[x;y]cons[x;y]]
isSexpr <= λ[[x]eq[first[x];QUOTE]]\arglist <= λ[[x]rest[x]]
istruth <= λ[[x] eq[x;T]]\body <= λ[[x]third[x]]
\\vars <= λ[[x]second[x]]
.END
So the only really new development is in the λ-binding process.
Notice that %3pairlis%* makes a new symbol table by consing new pairs onto
the front of the old symbol table; and recall that %3assoc%* finds the
%2first%* matching pair in the symbol table. %2This is important.%*
.P86:
To summarize then, the evaluation of an expression %3f[a%41%*; ...a%4n%*]%1,
where the a%4i%*'s are S-exprs,
is the same as the result of applying %3eval%* to the %eR%*-translation,
%3(%eR%f(%3#f#%f)%3,#%eR%f(%3#a%41#%f)%3,#...#%eR%f(%3#a%4n#%f)%3).%1
This behavior is again an example of the diagrams of {yon(P46)}. In its
most simple terms, we mapped LISP evaluation onto the LISP %3eval%* function;
mapped LISP expressions onto S-expressions; executed %3eval%*. Notice
that in this case we do not reinterpret the output since the structure of the
representation does this implicitly. We have commented on the efficacy
of this already on {yon(P110)}.
This specification of the semantics of LISP using %3eval%* and %3apply%*
is one of the most interesting developments of Computer Science.
.CENT(Problems)
I Relate our version of %3eval%* and %3apply%* with the version given in the appendix.
Though the current version is much more readable, how much of it %2still%* depends
on the representation we chose. That is how abstract is it really?
II. Complete the specification of the selectors, constructors, and recognizers.
.SS(Examples of %3eval%*,examples of %3eval%*)
After this onslaught, some examples are in order.
We will demonstrate the inner working of the evaluation algorithm
on a couple of samples and will describe the flow of control in the
execution in a couple of different ways.
The examples will be done in terms of the image of the %eR%*-mapping
rather than being done abstractly. We do this since the structure of an
actual LISP evaluator will use this representation⊗↓Recall we will
be programming in the %eR%*-image.←.
It is important that
you dilligently study the sequence of events in the execution
of the evaluator. The process is detailed and tedious but it must be done
once so that you convince yourself of its correctness.
Let's evaluate %3f[2;3]%* where %3f <= λ[[x;y] x%82%* + y].%1
That is:
←%3eval[ %eR%f(%3 f[2;3] %f)%3; %eR%f(%3 <f, λ[[x;y] +[↑[x;2]; y]]> %f)%3]%1.
After appropriate translation this is equivalent to evaluating:
←%3eval[(F 2 3);((F . (LAMBDA(X Y)(PLUS (EXPT X 2)Y))))]%*
%2Notes:%*
%21.%* %3((F . (LAMBDA (X Y) ... ))) = ((F LAMBDA (X Y) ... )) %*
%22.%* Since the symbol table, %3((F ...))%* occurs so frequently in the following
trace, we will abbreviate it as %3st%*. Note that we have no mechanism yet
for permanently increasing the repertoire of known functions. We must
resort to the subterfuge of initializing the symbol table to get the function
%3f%* defined.
%23.%* For this example we must assume that + and ↑ (exponentiation) are known functions. Thus
%3apply%* would have to contain recognizers for %3PLUS%* and %3TIMES%*:
.BEGIN NOFILL;
.GROUP
%3
...atom[fn] → [isplus[fn] → +[arg%41%*[args];arg%42%*[args]];
isexpt[fn] → ↑[arg%41%*[args];arg%42%*[args]];
...
.APART
%1
.END
.BEGIN NOFILL;TABS 10,25;TURN ON "\";
.P129:
.GROUP
So %3eval[(F 2 3);st]
\= apply[func[(F 2 3)]; evlis[arglist[(F 2 3)];st];st]
\= apply[F;evlis[(2 3);st];st]
\= apply[F;(2 3);st]
\= apply[eval[F;st];(2 3);st]
\= apply[(LAMBDA(X Y)(PLUS (EXPT X 2) Y)); (2 3);st]
\= eval[body[(LAMBDA(X Y)(PLUS (EXPT X 2) Y))];
\ pairlis[vars[(LAMBDA(X Y)(PLUS (EXPT X 2) Y))];(2 3);st]]
\= eval[(PLUS (EXPT X 2) Y);pairlis[(X Y);(2 3);st]]
\= eval[(PLUS (EXPT X 2) Y);((X . 2)(Y . 3)(F LAMBDA(X Y) ...))]
\= apply [PLUS; evlis[((EXPT X 2) Y);((X . 2)(Y . 3)..)];((X . 2)...)]
\ %1Let's do a little of:%3 evlis[((EXPT X 2)Y);((X . 2)(Y . 3)...)]
\\= concat[eval[(EXPT X 2);((X . 2)(Y . 3) ...)];
\\ evlis[(Y);((X . 2) ...)]]
\\= concat[apply[EXPT;evlis[(X 2);((X . 2)...)];((X . 2) ...]
\\ evlis[(Y); ...]]
\\= concat[apply[EXPT;(2 2);((X . 2) ...];evlis[(Y); ...]]
\\= concat[↑[arg%41%*[(2 2)];arg%42%*[(2 2)]];evlis[(Y); ... ]]
\\= concat[↑[2;2];evlis[(Y); ... ]]
\\= concat[4;evlis[(Y);((X . 2)(Y . 3) ....)]]
\\= concat[4;concat[eval[Y;((X .2) ...)]; evlis[( );(( ...))]]]
\\= concat[4;concat[3;( )]]
\\= (4 3) %1 which you knew anyway.
Now back to apply:
%3
\= apply[PLUS;(4 3);((X . 2)(Y . 3) ... )]
\= +[4;3]
\= 7 %1which you also knew.
.APART
.END
It should now be clear that %3eval%* and friends %2do%* begin to
perform as you would expect. It perhaps is not clear that a
simpler scheme might not do as well. In particular, the complexity
of the symbol table mechanism which we claimed was so important
has not been exploited. The next example will indeed show that a
scheme like ours is necessary to keep track of variable bindings.
.GROUP
Let's sketch the evaluation of %3fact[3]%* where:
←%3fact <= λ[[x][x = 0 → 1;%et%* → *[x;fact[x-1]]]].%1
.P42:
That is, %3eval[(FACT 3);st] %*where %3st%* names the initial symbol table,
←%3((FACT LAMBDA(X)(COND((ZEROP X)1)(T(TIMES X(FACT (SUB1 X))))))).%1
.APART
In this example we will assume that the binary function, *, the
unary predicate, %3zerop#<=#λ[[x]x#=#0]%* and unary function,
%3 sub1#<=#λ[[x]#x-1]%* are known and are
recognized in the evaluator by %3TIMES, ZEROP%* and %3SUB1%* respectively.
.BEGIN NOFILL;TURN ON "\";TABS 10,20,30;
.GROUP
%1Then%3 eval[(FACT 3);st]
\= apply[FACT;evlis[(3);st];st]
\= apply[(LAMBDA(X)(COND ...));3;st]
\= eval[(COND((ZEROP X)1)(T( ...)));((X . 3) . st)]
\= evcond[(((ZEROP X)1)(T(TIMES X(FACT (SUB1 X)))));((X . 3) . st)]
%1(Now, let %3st1%* be%3 ((X . 3) . st) )
\= eval[(TIMES X(FACT (SUB1 X))); st1]
\= apply[TIMES;evlis[(X (FACT (SUB1 X))); st1];st1]
\= apply[TIMES;concat[3;evlis[((FACT (SUB1 X))); st1];st1]
%1Now things get a little interesting inside %*evlis:
\evlis[(((FACT (SUB1 X)));st1]
\\= cons[eval[(FACT (SUB1 X)); st1];NIL]
\\%1 and %* eval[(FACT (SUB1 X));st1]
\\\= apply[FACT;evlis[((SUB1 X));st1];st1]
\\\= apply[FACT; (2);st1]
\\\= apply[(LAMBDA (X)(COND ...));(2);st1]
\\\= eval[(COND((ZEROP X) 1) ...));((X . 2) . st1)]
\\\...
.APART
.FILL
%1
Notice that within this latest call on %3eval%* the symbol-table-searching function,
%3assoc%*, will find the pair %3(X . 2)%* when looking for the value of %3x%*. This is
as it should be. But notice also that the older binding, %3(X . 3)%*, is still
around in the symbol table %3st1%*, and will become accessible once we complete
this latest call on %3eval%*.
.GROUP
As the computation continues, the symbol table is proceeding initially from,
.NOFILL
%3
←st =((FACT LAMBDA(X)(COND ...))) %1to%3,
←((X . 3) . st) %1to%*,
←((X . 2)(X . 3) . st) %1to%*,
←((X . 1)(X . 2)(X . 3) . st) %1to finally%*,
←((X . 0)(X . 1)(X . 2)(X . 3) . st).
%1
.FILL
.APART
Since the search function, %3assoc%1, always proceeds from left to right through
the table and since the table entry function, %3pairlis%*, always %3cons%*es pairs
onto the front of the table before calling %3eval%*, we will get the correct
binding of the variables.
This symbol table mechanism is very important, so let's look at it
again in a slightly different manner.
.END
.<<the | | symbol table >>
.P43:
We will represent calls on %3eval%* informally, using LISP expressions
rather than the %eR%*-image. We represent the symbol table between vertical
bars, "|", in such a way that if a table, t%41%*, is:
.BEGIN TABIT2(10,17);GROUP
\| b%4n%*\|
\| ...\| then %3cons%*ing a new element, b%4n+1%* onto t%41%* gives:
\| b%41%*\|
\| b%4n+1%*\|
\| b%4n%*\|
\| ...\|
\| b%41%*\|
.END
%3
.BEGIN NOFILL;TABS 10,46,58;TURN ON "\";
%1Then:%*
eval[`fact[3]'; |fact = λ[[x][x=0 → 1;%et%* → *[x;fact[x-1]]]]|]
.GROUP
\= eval[`λ[[x][x=0 → 1; %et%* → *[x;fact[x-1]]]];\|x = 3 | ]
\\|fact = λ[[ ... |
.APART
.GROUP
\= *[3;eval[`λ[[x] ....'; \| x = 2 |]
\\| x = 3 |
\\|fact = λ[[ ...] |
.APART
.GROUP
\= *[3; *[2;eval[`λ[[x] ... ';\| x = 1 |]
\\| x = 2 |
\\| x = 3 |
\\|fact = λ[[ ... |
.APART
.GROUP
\= *[3; *[2; *[1;eval[`λ[[x] ... ';\| x = 0 |]
\\| x = 1 |
\\| x = 2 |
\\| x = 3 |
\\|fact = λ[[ ... |
.APART
.GROUP
\= *[3; *[2; *[1;1]]] %1with:%*\| x = 1 |
\\| x = 2 |
\\| ... |
.APART
.GROUP
\= *[3; *[2;1]] %1with:%*\| x = 2 |
\\| ... |
.APART
.GROUP
\= *[3;2] %1with:%*\| x = 3 |
\\| ... |
\= 6. %1with:%*\|fact = λ[[ ... |.
= 6
.END
%1
Notice that after we went to all the trouble to save the old values
of %3x%* we never had to use them. However, in the general case of
recursive evaluation we must be able to save and restore the old values
of variables.
For example recall the definition of %3equal:
.BEGIN NOFILL TURN ON "\";TABS 17;
.GROUP
%3
equal <= λ[[x;y][atom[x] → [atom[y] → eq[x;y];%et%* → %ef%*];
\ atom[y] → %ef%*];
\ equal[car[x];car[y]] → equal[cdr[x];cdr[y]];
\ %et%* → %ef%*]
.APART
%1If we were evaluating:%3
←equal[((A . B) . C);((A . B) . D)],
%1
then our symbol table structure would be changing as follows:
.END
%3
.BEGIN NOFILL;TABS 10,40;TURN ON "\";
.GROUP
\|equal = λ[[x;y] ...| ==>\|x = ((A . B) . C)| ==>
\\|y = ((A . B) . D)|
\\|equal = λ[[x;y]...|
.APART
.GROUP
\| x = (A . B) |\| x = A |
\| y = (A . B) |\| y = A |
\| x = ((A . B). C) | ==>\| x = (A . B) |
\| y = ((A . B). D) |\| y = (A . B) | ==>
\|equal = λ[[x;y ... |\| x = ((A . B). C) |
\\| y = ((A . B). D) |
\\|equal = λ[[x;y] ...|
.APART
.GROUP
\| x = B |\| x = C |
\| y = B |\| y = D |
\| x = (A . B) |\| x = ((A . B). C) | ==>
\| y = (A . B) | ==>\| y = ((A . B). D) |
\| x = ((A . B). C) |\|equal = λ[[x;y] ... |
\| y = ((A . B). D) |
\|equal = λ[[x;y] ... |
←|equal = λ[[x;y] ... |
.END
%1
This complexity is necessary, for while we are evaluating
%3equal[car[x];car[y]]%*, we rebind %3x%* and %3y%* but we must save the old
values of %3x%* and %3y%* for the possible evaluation of %3equal[cdr[x];cdr[y]].%*
But the claim is that using %3pairlis%* to cons the new
bindings onto the front of the symbol table as we call %3eval%*
does the right thing. That is, in %3apply%* ({Yon(P69)}):
%3
←islambda[fn] → eval[body[fn];pairlis[vars[fn];args;environ].
%1
The tricky part is that when we leave that particular call on
%3apply%*, the old table is automatically restored by the recursion
mechanism. That is, if %3st%* is the current symbol table then
%3cons%*-ing things onto the front of %3st%* doesn't change %3st%*, but if we
call %3eval%* or %3apply%* with a symbol table of say:
←%3concat[(X . 2);concat[(X . 3); st]] %*
then in %2that%* call on %3eval%* or %3apply%* we have access to %3x = 2%*, not
%3x = 3%*.
.P93:
Before moving on we should examine %3eval%* and %3apply%* to see how they
compare with our previous discussions of LISP evaluation.
The spirit of call-by-value and conditional expression evaluation is maintained.
λ-binding seems correct, though our current discussion is not complete.
At least one preconception is not maintained here. Recall the discussion on
{yon(P95)}. We wanted n-ary functions called with exactly n arguments. An
examination of the structure of %3eval%* and %3apply%* shows that
if a function expecting %2n%* arguments is presented with less, then
the result is undefined; but if it is given %2more%* arguments than necessary
then the calculation is performed. For example:
.BEGIN TABIT1(33);GROUP;SELECT 3;
eval[(CONS(QUOTE A)(QUOTE B)(QUOTE C));NIL]
\= eval[(CONS(QUOTE A)(QUOTE B));NIL]
\= (A . B)
.END
This shows one of the pitfalls in defining a language by an evaluator.
If the intuitions of the language specifiers are faulty or incomplete
then we are faced either with maintaining that faulty judgement as if nothing
were wrong; or we must lobby for a "revised report". Neither alternative
says much for the field of language design.
.CENT(Problems)
%2I%* Which parts of %3eval%* and Co. allow correct evaluation functions
applied to too many arguments?
%2II%* On {yon(P93)} we noted that the evaluator performs "correctly"
when evaluating forms like %3cons[A;B;C]%*.
Can you find other anomalies in %3eval%* and friends?
.REQUIRE "PROB8.PUB" SOURCE_FILE;
.SS(Free variables,free variables,P135:)
.BEGIN TURN ON "#";
Let's look more closely at λ-binding in %3eval%*.
The scheme presented seems reasonable,
but as in the case %3cons[A;B;C]%*, there may be more going on here
than we are perhaps aware of.
Consider the function: %3f <= λ[[x]x#+#y]%*. If we asked %3eval%*
to compute %3f[2]%*, it would complain. It would happily
find that %3x%* is %32%* but would find no value for
%3y%*. However, if we asked it to evaluate the form %3λ[[y]f[2]][1]%*
it %2would%* work. It would find the value of %3y%* to be
%31%* and would get a final answer of %33%* as expected. You should convince
yourself of this assertion.
The variable %3y%* within the definition of %3f%* has a different
character than that of %3x%*. To pursue this further we need to make some
definitions.
The λ-variable %3x%* in %3λ[[x]x#+#y]%* is called a
%2⊗→bound variable↔←%*. Bound variables are also cursed with the
ungracious name of "dummy" variables.
We can say that an occurrence of a variable %3x%* is bound if it
appears within a λ-expression whose list of λ-variables contains %3x%*,
or is in fact a λ-variable. Occurrences of variables which are not
bound are called free occurrences.
Such variables are called %2⊗→free variable↔←%*s.
Thus in the definition of %3f%*, %3y%* occurs free; and in the body of
%3f%* which is %3x#+#y%*, both %3x%* and %3y%* occur free.
Consider the expression
.BEGIN CENTERIT;SELECT 3;
←λ[[y]h[x; λ[[x]car[x]][(A . B)] ]]
.END
The first occurrence of %3x%* is free; the next two occurrences are
bound. Both %3h%* and %3car%* occur free.
One of the difficulties in programming languages
is deciding what value to associate with a free variable.
It is clear %2how%* values get associated; it happens through λ-binding.
The binding strategy for λ-variables is reasonably
uniform in programming languages; bind some form of the actual parameter⊗↓
evaluated or unevaluated← to the dummy variable and evaluate the body of the
%3λ%*-expression.
While we are evaluating the body of the function the λ-variables are
free, but they do have associated values thanks to the binding process.
The difficulty is that frequently there will be choices of values to
associate.
The scheme which LISP uses for discovering the value of any variable is to
proceed linearly down the symbol table, looking for the %2first%* binding.
This scheme is called %2⊗→dynamic binding↔←%*. It %2usually%* results
in uncovering the value that is expected. But not always.
Conceptually, the dynamic binding scheme corresponds to the physical
replacement of the function call with the function body and then an
evaluation of the resulting expression.
We first wish to develop a useful notation for describing
bindings before delving further into the intracacies of binding strategies.
That discussion will be the content of {yonss(P134)}.
.END
.BEGIN GROUP;
.BEGIN SELECT 2;CENTERIT;
←Problems
.END
I Write a LISP predicate, %3free <= λ[[x;e]..]%*, which will give %et%*
just in the case that %3x%* and %3e%* represent a variable and a %9λ%*-expression
respectively, and %3x%* is "free" in %3e%*.
.END
.SS(Environments and bindings,,P77:)
.<< the Weizenbaum symbol table>>
This section will introduce one more notation
for describing symbol tables or environments. This notation, due to
J. Weizenbaum, only shows the abstract structure of the symbol table manipulations
during evaluation. Its simplicity will be of great benefit when we introduce
the more complex binding schemes necessary for function-valued functions.
See {yonss(P76)}.
In the previous discussions it has been sufficient
simply to think of a symbol table as a sequence pairs; each pair was a variable
and its associated value. First, we survived because we dealt only with
λ-variables; that is, we ignored the possibility of free variales.
As long as we added the λ-bindings to the
%2front%* of the sequence representing the symbol table we showed that the
correct evaluation would result.
Even after we recognized the existence of %2free%* variables, we survived
because of LISP's generous dynamic binding scheme.
However with the advent of free variables, it is convenient and soon
will be necessary, to examine the structure of environments more closely.
The evaluation of a typical function-call will involve
the evaluation of the arguments, the binding of the λ-variables
to those values, the addition of these new bindings to the front of the
symbol table,
and finally the evaluation of the body of the function.
That segment of the symbol which we have just added by the λ-binding will be called
the %2local symbol table%* or local environment. The variables
which appear in that segment are called %2⊗→local variable↔←s%*.
The remainder of the symbol table makes up the %2global table%* and the
variables which appear in the global table but not in the local table
are called %2⊗→global variable↔←s%*.
Notice that
variables which are local to a form-evaluation are those which were
present in the λ-binding; they were the bound variables.
Global variables correspond roughly to free variables.
We will describe our environments in terms of a local symbol table
augmented by a description of where to find the global values.
Thus instead of having one amorphous sequential symbol table, we
envision a sequence of tables. One is the local table, and its
successor in the sequence is the previous local table. The
information telling where to find the previous table is called the
%2⊗→access chain↔←%*.
Thus if tables are represented by E%4i%1 and the access chaining
by %d→%* then we might represent a symbol table as:
.BEGIN CENTERIT;
←%3(%1E%4n%d → %1E%4n-1%d → %1 ... %1E%41%3)
.END
where %1E%4n%1 is the local or current segment of the table.
LISP thus finds local bindings in the local
table and uses the access chain to find bindings of global variables.
If a variable is not found in any of the tables, then it is undefined.
Now to establish some notation,
an environment will be described as :
.BEGIN TABIT2(36,40);GROUP;
\\%aForm%*
\\E%4local%*
\\|
\ \| E%4i%*
\_________
\var\| value
\%3v%41%1\| %aval%41%1
\%3v%42%1\| %aval%42%1
\ .....
\%3v%4n%1\| %aval%4n%1
\\|
.END
%aForm%* is the current form being evaluated.
E%4local%* is the name of the current environment or symbol table.
Let %3x%* be a variable appearing in %aForm%*. If %3x%* is not found
among the %3v%4i%1's,
then entries in the table named E%4i%* are examined.
If the variable is not found
in E%4i%* then the environment mentioned in the upper right-hand quadrant
of E%4i%* is searched. The search will terminate if the variable is found;
the value is then the corresponding %aval%4i%1.
If %3x%* is not found locally, and the symbol "/" appears in
the right-hand quadrant, then %3x%* is undefined.
The notation is used as follows: when we begin the evaluation of a
form, an initial table E%40%* is
set up with "/" in its access field.
The execution of a function definition, say %3f#<=#λ[[x;y]x%82%*#+#y]%1, will
add an appropriate entry to the table, binding %3f%* to its lambda definition.
Next, consider the evalaution of the form %3f[2;3]%*.
When the λ-expression is entered, i.e.,
when we bind the evaluated arguments (%32%* and %33%*) to the
λ-variables (%3x%* and %3y%*), a new local table (E%41%*) is set up
with an access link of E%40%*.
Entries reflecting the binding of the λ-variables are made in E%41%* and
evaluation of the λ-body is begun.
The flow of symbol tables is:
.BEGIN TURN ON "\";NOFILL;TABS 10,13,18,23,26,41,46,49,54,57,60,65,70,73;
.GROUP
\\%3f <= ...\\\f[2;3]\\\x%82%* + y%1
\\E%40%*\\\E%40%*\\\E%41%*\\\E%40%*
\ \| /\\ \| /\\ \| E%40%*\\ \| /
\______\=>\______\=>\______\=>\______
\\|\ \ %3f\| λ[[x;y]x%82%* + y]%1\ \%3x\| 2\ \f\| λ[[x;y] ...
\\\\\\\ y\|3%1
Compare this sequence to the example on {yon(P129)}.
.END
.BEGIN GROUP;
That is, the sequence of tables corresponds to the evaluation sequence:
.BEGIN CENTERIT;GROUP;turn off "{","}";
←%3eval[{ %eR%f( %3f<= λ[[x;y] x%82%* +y] %f)%3;%eR%f( %3f[2;3] %f)%3}; ()]
←↓
←%3eval[%eR%f( %3f[2;3] %f)%3; %eR%f( %3<f , λ[[x;y] x%82%* +y]> %f)%3]
←↓
←%3eval[%eR%f( %3x%82%* + y %f)%3; %eR%f( %3<x, 2>, <y, 3>, <f , λ[[x;y] x%82%* +y]> %f)%3]
←↓
←%37%1
.END
.END
.FILL;
.<<RED'S FACT>>
.GROUP
The execution of %3fact[3]%* on {yon(P42)} results in a more interesting
example. The following discussion should be read in conjunction with that
description⊗↓The layout of this example is due to R. Davis.←.
.P104:
.BEGIN TURN ON "\";NOFILL;TABS 9,14,24,29,34,44,49,54,65,67;preface 0 mills;
.TURN OFF "←";
\\\\\\\\\%B⊃%*
\\\\\\\\\%Bεα→ %36%*
\\%3fact[3]\\\[x=0→ ...]\\\*[x;fact[x-1]]\%bε←⊃%*
\\%1E%40%*\\\E%41%*\\\E%41%*\%b~ ~%*
\ \| /\\ \| E%40%*\\ \| E%40%*\%b$ ↑%*
\_______\=>\_______\=>\_______ =>\\%32%*
\%3fact\| λ[[x][x=0→1;...]\ x\| 3\\ x\| 3\\%b↑%*
\\\\\\\\\%B⊃ ~%*
\\\\\\\\\%Bε→$%*
\\%3fact[2]\\\[x=0→ ...]\\\*[x;fact[x-1]]\%b~%*
\\%1E%41%*\\\E%42%*\\\E%42%*\%bε←⊃%*
\ \| E%40%*\\ \| E%41%*\\ \| E%41%*\%b$ ↑%*
=>\_______\=>\_______\=>\_______ =>\\%31%*
\%3 x \| 3\\ x\| 2\\ x\| 2\\%b↑%*
\\\\\\\\\\%B~%*
\\\\\\\\\%B⊃ ~%*
\\\\\\\\\%Bε→$%*
\\%3fact[1]\\\[x=0→ ...]\\\*[x;fact[x-1]]\%b~%*
\\%1E%42%*\\\E%43%*\\\E%43%*\%bε←⊃%*
\ \| E%41%*\\ \| E%42%*\\ \| E%42%*\%b$ ~%*
=>\_______\=>\_______\=>\_______ =>\\%b~%*
\%3 x \| 2\\ x\| 1\\ x\| 1\\%b↑%*
\\\\\\\\\\%31%*
\\\\\\\\\\%B↑%*
\\\\\\\\\\%B~%*
\\%3fact[0]\\\[x=0→1; ...]\\\\%b⊃ ~%*
\\E%43%*\\\E%44%*\\\\%b~ ↑%*
\ \| E%42%*\\ \| E%43%*\\ \%1send%*\%b~ ~%*
=>\_______\=>\_______\=> 1\\ %31 \%bε→$%*
\%3 x \| 1\\ x\| 0\\ \%1back up\%b$%*
.END
At the end of the first line we are faced with the evaluation of
%3*[x;fact[x-1]]%*. This requires the evaluation of the arguments
to *; this is done by %3evlis%* and requires the evaluation of %3fact[x-1]%*
using the environment E%41%*. In E%41%* %3x-1%* hopefully gives %32%*, and
we find the definition of %3fact%* in E%40%*. At the beginning of line two
we set up E%42%* and evaluate %3fact[2]%*. Analogous situations
occur until line four; at this time we suddenly find ourselves in
E%44%* with %3x%* bound to %30%*. The predicate %3x=0%* is satisfied and
we start back up the right margin to conclude the nested evaluations
of %3*[x;fact[x-1]]%*. This process finally terminates at the
top returning with a value %36%*.
.APART
Notice in this example we will get the correct binding of %3x%* locally.
It is important to note that the occurrence of %3fact%* within the body of
the definition of %3fact%* is %2free%*! We find the correct binding
for %3fact%* by searching the access chain.
.GROUP
As a final example showing access to ⊗→free variable↔← bindings consider:
%3f[3]%* where: %3f#<=#λ[[x]g[2]]%* and %3g#<=#λ[[y]x+y]%*.
.BEGIN TURN ON "\";NOFILL;TABS 10,13,18,23,26,41,46,49,54,57,60,65,70,73;
.GROUP
\%3f <= ...; g <= ...\\f[3]\\\g[2]\\\x + y ....
\\%1E%40%*\\\E%40%*\\\E%41%*\\\E%42%* ....
\ \| /\\ \| /\\ \| E%40%*\\\| E%41%*
\______\=>\______\=>\______\=>\______\ ......
\\|\\ %3f\| λ[[x] g[2]]\\%3x\| 3\ \y\| 2 ...
\\\\%3g\| λ[[y] x+y]
.END
Notice that when we attempt to evaluate %3x + y%* we find %3y%* has a local
value, but we must look down the access chain to find a binding for %3x%*.
.APART
The scheme for using Weizenbaum environments for the current LISP subset
should be clear.
.BEGIN INDENT 5,5;GROUP;
When doing a λ-binding, set up a new E%4new%* with
the λ-variables as the local variable entries and the values of the
arguments as the corresponding value entries. In this interpretation, "<="
is just another λ-binding. The access slot of the new E%4new%* points
to the previous symbol table. The evaluation of the body of the
λ- expression takes place using the new table; when a local variable
is accessed we find it in E%4new%*; when a global variable occurs, we
chase the access chain to find its value.
When the evaluation of the form is completed, E%4new%* disappears
and the previous environment is restored.
.END
Obviously you should convince yourself that the current access- and binding-scheme
espoused by LISP is faithfully described in these diagrams.
.CENT(Problems)
%21.%* By now you should realize that environments really are a class of abstract
data structures: they should have constructors selectors and recognizers.
To help discover what a set of such functions might be, write a new version
of the symbol table searching function which operates on Weizenbaum environments.
The function should search the local table first, then if the variable is not
found there, use the access chain information to search further.
After writing the abstract version of the function give a representation
for the environments and give a set of functions which you think would
be useful in manipulating environments.
.SS(%3label%*,%3label%*,P38:)
One effect of placing "λ" and a list of λ-variables in front
of an expresson is to bind those variables which appear both in
the λ-list and the expression. All other variables
appearing in the expression are %2⊗→free variables↔←%*.
For example, %3f%* is free in the following:
←%3f <= λ[[x][zerop[x] → 1; %et%* → *[x;f[x-1]]] ].%1
Clearly our intention is that the %3f%* appearing the the right of "<="
is the same %3f%* appearing to the left of "<=". That is we want %3f%* to be
a bound variable and not free.
This has not been a problem for us. We have simply pre-loaded the
symbol table, binding %3f%* to its definition (or value). See {yon(P42)}. LISP
has been equipped with a slightly more elegant device for this binding.
It is called the %3label%* operator. It is called thus:
←%3label[%1<identifier>;<function>]
and has exactly the effect of binding the <identifier> to the <function>.
Thus the effect of evaluating a %3label%*-expression is to generate a function.
For example, a proper definition of %3fact%* is:
←%3label[fact; λ[[x][eq[x;0] → 1; %et%* → *[x;fact[sub1[x]]]]]]%1
To include %3label%* in the LISP syntax add:
.BEGIN TABIT1(11);CENTERIT;
←<function>::= %3label%*[<identifier>;<function>]
and the S-expr translation of the %3label%* construct should naturally be:
←%eR%f( %3label[f;fn] %f)%3 = (LABEL %eR%f( %3f %f)%3, %eR%f( %3fn %f)%3)%1
Note that %3label%* should be a special form.
.END
A typical application of %3label%*, say %3label[f;λ[[x]%9x%*[x]]][A]%1
results in the following environmental picture when we get ready to evalute
%9x%3[x]%1:
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
\%3label[f;λ[[x]%9x%*[x]]][A]%1 \\%9x%3[x]%1
\\E%40%*\\\E%41%*
\ \/| /\\ \| E%40%*
\______________\=> ... \______
\ \ | \\%3f\| λ[[x]%9x%*[x]]
\ \ | \\x\|A
.END
Thus within the evaluation of the body %9x%3[x]%1
recursive calls on %3f%* will send us to environment E%41%1 and we
will resurrect the definition of %3f%* as needed.
.BEGIN TURN ON "#";
With the introduction of %3label%* we can talk more precisely about "<=".
When we write "what is the value of %3f[2;3]%* when %3f#<=#λ[[x;y]#...#]%*?"
we mean: evaluate the form %3[label[f;#λ[[x;y]#...]][2;3]]%*. If %3f%*
is not recursive, then the %3label%* application is unnecessary. The
anonymous function %3λ[[x;y]#...#]%* applied to %3[2;3]%* suffices.
What about statements like "evaluate %3g[A;B]%* where
%3g#<=#λ[[x;y]#...#f[u;v]#...]%* and %3f#<=#λ[[x;y]#...#]%1."?
%3label%* defines only one function; we may not say %3label[f,g;#...#]%*.
What we %2can%* do embed the %3label%*-definition for %3f%* within
the %3label%*-definition for %3g%*⊗↓Indeed %2every%* occurrence of %3f%*
must be replaced by the %3label[f;...]%1construct.←. Thus:
.BEGIN CENTER;SELECT 3;
label[g;#λ[[x;y]#...#label[f;#λ[[#...#]#...#][u;v]#...]%*
.END
The perspicuity of such constructions is minimal. Needless to say,
implementations of LISP offer better definitional facilities.
.END
The apparent simplicity of the %3label%* operator is partly due to
misconception and partly due to the restrictions placed on the current
subset of LISP. %3label%* appears to be a
rather weak form of an assignment statement. When we extend LISP to include
assignments we can easily show that such interpretation of %3label%* is
insufficient;
when we talk about a mathematical interpretation of LISP
we will show that %3label%* really requires careful analysis.
.CENT(Problems);
I. Show one way to change %3eval%* to handle %3label%*.
II. Express the definition of %3reverse%* given on {yon(P97)} using
%3label%*.
.BEGIN GROUP;
III Evaluate the following:
.SELECT 3;CENTERIT;
←λ[[y]label[fn;fn%42%*][%ef%*]] [%ef%*]
where:
←fn%42%* <= λ[[x][y → 1; x → 2; %et%* → fn%41%*[%et%*]]
←fn%41%* <= λ[[y]fn[y]]
.END
.SS(Functional Arguments and Functional Values,functional argument,P76:)
.BEGIN TURN ON "#","←";
Recall our discussion of :
%3eval[(F#2#3);((F#.(LAMBDA(X#Y)(PLUS#(EXPT#X#2)Y))))].%*
We now know this is equivalent to:
%3eval[((LABEL#F#(LAMBDA(X#Y)(PLUS#(EXPT#X#2)#Y)))#2#3);NIL]%1.
In either case, the effect is to bind the name %3f%* to the λ-expression.
Binding is also going on when %3f%* is called: we bind %32%* to %3x%*, and %33%*
to %3y%*.
In the latter case we are binding simple values; in the former we are binding
functions as values. We have decided that the necessary ingredients to
characterize a functional value
⊗↓It would be better
to call these constructs procedure-values since we will
take a decidedly algorithmic interpretation of them. As we now
know there are important differences between functions and
algorithms.←
are a representation of the formal parameters, and a representation of the
expression described in the body of the function.
In this section we will examine the adequacy of that decision.
We will proceed formally with a few examples and see what happens.
Assume we have a list %3l%* of elements %9α%41%1#...,#%9α%4n%1, and we
wish to form a new list of the form %3(car[%9α%41%3]#...%3car[%9α%4n%3])%1.
That is apply %3car%* to each of the elements of %3l%*. Such a function is
easy to write:
←%3carfirst <= λ[[l][null[l] → ( ); %et%* → concat[car[first[l]];carfirst[rest[l]]]]].%1
Now suppose we wish to write a more general function, which instead
of being specific to %3car%*, will take an %2arbitrary%* unary function %3f%* and
apply it to each of the elements of %3l%*,
generating %3(f[%9α%41%3],#...,#%3f[%9α%4n%3])%1.
Such a function could plausibly be defined as follows:
.P34:
←%3mapfirst <= λ[[fn;l][null[l] → ( ); %et%* → concat[fn[first[l]];mapfirst[fn;rest[l]]]]].%1
Thus the first calculation we requested above could be expressed as:
←%3mapfirst[car;l]%1 .......or could it?
Recalling LISP's penchant for call-by-value evaluation, we should realize
that the computation would not be done as expected. We do%2not%* want the
argument %3car%* evaluated. We want its %2name%*. We have one artifact in LISP
to subdue evaluation: we can make it a constant. Indeed,
←%3mapfirst[CAR;l] %2will%1 work.
You should convince yourself that %3mapfirst[CAR;l]%* will compute %3carfirst[l]%*.
Before going on to more complex examples it would be well to note that
%3mapfirst%* is a different kind of LISP function that those we have seen before.
Namely the first argument to %3mapfirst%* is expected to be a function; notice
that the argument %3fn%* appears in the body of %3mapfirst%* in a position reserved
for functions. Thus any actual parameter bound to %3fn%* is expected to be a function
Such a use of a function is called a %2functional argument%*.
Now the trick we used above of representing the functional argument as a constant
can be applied to arbitrary functional arguments. Thus the functional argument:
←%3λ[[x]f[g[x]]%* could be represented as,
←%3(LAMBDA(X)(F(G X))).%*
The difficulty is that the trick⊗↓The trick is typicaly called %3QUOTE%*-ing
the functional argument.← is not sufficient to capture the intended meaning
in all cases.
Instead we introduce a new operator called ⊗→%3function%*↔←. Thus:
←%3function[λ[[x]f[g[x]]] %* with an %eR%*-image of,
←%3(FUNCTION(LAMBDA(X)(F(G X)))).%*
Thus for the above example %3mapfirst[function[car];l]%* is the correct construction.
To understand what the the effect of the %3function%*-construct should be we need
a slightly more complex set of examples. First we try:
←%3mapfirst[function[λ[[x]concat[x; ( )]]];(A B C D)] = ((A)(B)(C)(D)).%*
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
%3mapfirst[function[λ[[x]concat[x;()]]; ..]\[null[l] ...%1
\\E%40%*\\\E%41%*
\ \| /\\ \| E%40%*
\______________\=>\______\=> . . .
\%3mapfirst\| λ[[fn;l][null[l]...]]\\l\| (A B C D)
\\\\fn\| λ[[x]concat[x;()]%1
.END
Now since %3null[l]%* is false, we reduce the problem to:
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
\%3concat[fn[first[l]];mapfirst[fn;rest[l]]].%1
\\E%41%*
\ \| E%40%*
\______________
\%3l\| (A B C D)
\fn\|λ[[x]concat[x;()]%1
.END
Since we're still doing call-by-value we have to evaluate the arguments
to %3concat%*; that requires evaluating %3fn[first[l]]%*. The value of
%3l%* we find locally and evaluate %3first[l]%*, getting %3A%*. The value
for %3fn%* is also found locally, and since it is a λ-definition, we set up a
new environment in which to evaluate the body of %3fn%*, binding the λ-variable
%3x%* to %3A%*:
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
\%3concat[x;()]%1
\\E%42%*
\ \| E%41%*
\______________
\%3x\| A%1
.END
The expected evaluation takes place: %3(A)%* is computed and returned to
environment E%41%1 so that we may continue the evaluation %3mapfirst[fn;rest[l]%*.
However, consider the following variant of this last example.
Define:
←%3foo <= λ[[l]mapfirst[function[λ[[x]concat[x;l]]]; (A B C D)]%*.
It would
seem that %3foo[(#)]%* would also give %3((A)(B)(C)(D))%*
since %3l%* will be bound to %3(#)%* and therefore the %3l%* in the functional
argument will effectively be %3(#)%*.
Things should work it seems.
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
\ %3foo[( )]\\mapfirst[function[λ[[x]concat[x;l]]; ..]\[null[l] ...
\\%1E%40%*\\\E%41%*\\\E%42%*
\ \/| /\\ \| E%40%*\\ \| E%41%*
\______________\=>\______\=>\______ =>
\ %3foo\| λ[[l]... \\l\| ( ) \\l\| (A B C D)
\mapfirst\| λ[[fn;l][null[l]...]]\\\ \\fn| λ[[x]concat[x;l]%*
.END
%3null[l]%* is false since %3l%* is %3(A B C D)%*, so we evaluate
%3concat[fn[first[l]#...#]%*. This involves evaluating %3first[l]%*
in E%42%1, giving %3A%*. We evaluate (lookup) %3fn%* in E%42%1 and
finding a λ-definition, we make a new environment E%43%1 in which to evaluate
the body of %3fn%*. As we make E%43%1, we add an entry binding %3x%* to %3A%*
and we (gasp!) settle down in E%43%1 to evaluate %3concat[x;l]%*:
.BEGIN TABIT2(5,10);GROUP;
\\%3concat[x;l]%1
\\E%43%*
\ \| E%42%*
\__________
\ %3 x\| A
.END
Since %3l%* is global to E%43%1, we follow the access chain to find its value
in E%42%1 to be %3(A B C D)%1. But that's not the expected value! We expected
to find %3(#)%*, which was hidden away in E%41%1.
The trouble here is that
%3l%* was rebound in the interim. The first thing to note is that the problem
is caused by free variables; %3l%* is free in the functional argument.
Next note that the desired binding for %3l%* is the one which was current
when we were binding the functional argument to the formal parameter %3fn%*.
A plausible solution then is to replace all free variables with their values
at the time we recognize the %3function%*-construct. This will not suffice.
See problem on {yon(P151)} for a counterexample. A sufficient solution is to
associate the name of the current environment with the function and use
that pair as the value to be given to the formal parameter.
When we want to apply the functional argument we set up a new environment as
always, introducing a local table with the λ-variables bound to their values;
only %2now%* we use the saved environment as the beginning of the access chain.
Then the values of
any free variables which we encounter in the process of applying the functional
argument will be searched for in the saved environment.
Thus in the example we would recognize the %3function%*-construct while
evaluating the arguments to %3mapfirst%*; the environment which was current
then was E%41%*. Therefore as we build E%42%* we want to associate the pair
%3λ[[x]concat[x;l]#-#%1E%41%* with the formal parameter %3fn%*. Whenever
we apply %3fn%* we want to use %3λ[[x]concat[x;l]]%*; and within that
context, whenever we want %3l%*, we want the value of %3l%* in E%41%*.
Now to formalize this discussion:
First, this function-environment pair is called a %2⊗→closure↔←%* or
%2⊗→funarg↔←%*.
In our diagrams we will designate the pair as:
←<function>:<environment>.
Thus in the above example we should designate the value of the functional argument
as:
←%3λ[[x]concat[x;l]]:%1E%41%1.
We must also extend the manipulation of Weizenbaum environments to
handle such constructions.
The process which recognizes λ-definitions and sets up new evironments must now
watch for funargs.
When it sees one it uses the associated
environment as the access environment.
Let's do the example again.
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
\ %3foo[( )]\\mapfirst[function[λ[[x]concat[x;l]]; ..]\[null[l] ...
\\E%40%*\\\E%41%*\\\E%42%*
\ \| /\\ \| E%40%*\\\| E%41%*
\______________\=>\______\=>\______ => . . .
\ %3foo\| λ[[l]... \\l\| ( ) \\l\| (A B C D)
\mapfirst\| λ[[fn;l][null[l]...]]\\\| \\fn\| λ[[x]concat[x;l]%1:E%41%1
.END
Things are as before except now the funarg pair is
bound to %3fn%* in E%42%*.
We lookup %3fn%* in E%42%1 and
finding a λ-definition, we make a new environment E%43%1 in which to evaluate
the body of %3fn%*. As we make E%43%1, we add an entry binding %3x%* to %3A%*.
But now since the λ-definition is a funarg we make the access environment
E%41%1 as saved with %3fn%*.
Thus we settle down in E%43%1 to evaluate %3concat[x;l]%*:
.BEGIN TABIT2(5,10);GROUP;
\\%3concat[x;l]%1
\\E%43%*
\ \| E%41%*
\__________
\ %3 x\| A
.END
Since %3l%* is global to E%43%1, we follow the access chain to find its value
in E%41%1 to be %3(#)%1 as desired.
Thus instead of simply tracing back to the previous environment we detour
around E%42%*:
.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";
.TURN ON "?" FOR "\";
.TURN OFF "\","←";
.TABS 30,33;
?∞1E∞40∞g
?↑
?~
?∞1E∞41∞g←?⊃
?↑?.
?~?.
?∞1E∞42∞g?↑
?↑?.
?~?.
?∞1E∞43∞g→?$
.END
Before toasting our brilliance, we must sorrowfully note that we are not
finished yet! There is still some information which we must make explicit
if these Weizenbaum diagrams are to faithfully represent the process of evaluation.
Namely, after we have finished the evaluation of %3concat[x;l]%1 we are to
restore a previous environment. Which one is it? It isn't E%41%*,
it's E%42%1! That information is not available in our diagram, so
clearly we must correct the situation.
The solution is clear. In the left-hand quadrant of our diagram we
place the name of the environment which we wish restored when we leave the
current environment.
That environment name will be called the %2⊗→control environment↔←%*;
and will head a chain of environments, called the control chain
⊗↓In Algol, the access chain is called the static chain, and the control
chain is called the dynamic chain.←.
Now we can relax. Here's the correct picture:
.BEGIN TABIT2(5,10);GROUP;
\\%3concat[x;l]%1
\\E%43%*
\E%42%*\| E%41%*
\__________
\ %3 x\| A
.END
So after we have finished computations in E%43%* we return control to E%42%*.
Thus the general structure of an environment is as follows:
.BEGIN TABIT2(33,42);GROUP;
\\E%4current%*
\\|
\E%4control%1\| E%4access%*
\______________________________
\var\| value
\%3x%41%1\| . . .
\%3x%42%1\| . . .
\. . .\| . . .
\%3x%4n%1\| . . .
\\|
.END
Here's another example involving a function to produce the composition
of two unary functions. We will call the function %3compose%* The value
returned by %3compose%* will be a function. Thus %3compose%* will produce
functional values. Thus:
.BEGIN CENTERIT;SELECT 3;GROUP;
←compose[car;cdr] = cadr ⊗↓%1to be completely honest we should write %3function[car]%*
and %3function[cdr]%*←
%1with a plausible definition as:%3
←compose <= λ[[f;g]λ[[x]f[g[x]]]
.END
This definition of %3compose%* is almost right. As you might have noticed, the
value returned by %3compose%* is to be a function. Indeed it is an instance of
a %2⊗→functional value↔←%*, and as with functional aguments, it needs to be
decorated with %3function%* so that the environment which contains the right
bindings for %3f%* and %3g%* is saved. Which environment is that? It's the
environment which will be current when the %3function%*-construct is recognized.
So we write:
.BEGIN CENTER;
.p152:
←%3compose <= λ[[f;g]function[λ[[x]f[g[x]]]].
%1Now try: %3app[cons[A;(B . C)];compose[car;cdr]]%1;
where: %3app <= λ[[y;f]f[y]]%1
.END
As usually we should evaluate the arguments to %3app%*, bind the results
to %3y%* and %3f%* and evaluate the body of %3app%*.
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
\%3app[cons[A;(B .C)];compose[car;cdr]]
\\%1E%40%*
\ \/| /
\______________
\ %3app\ | λ[[y;f]f[y]]
\compose\ | λ[[f;g]function[λ[[x]f[g[x]]]]]%1
.END
Evaluation of the first argument to %3app%* brings no suprises; we get
%3(A#.(B#.#C))%1. We begin evaluating the second argument;
we find the definition of %3compose%* in the environment and
since it is a λ-definition we set up a new environment, E%41%*, and evaluate
the body %3function[λ[[x]f[g[x]]]]%*:
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
\%3function[λ[[x]f[g[x]]]]%*
\\%1E%41%*
\ E%40%*\| E%40%*
\ __________
\ %3f\| car
\ g\| cdr%1
.END
Again, the recognition of the %3function%*-construct says return a funarg-pair
as value. The environment we associate is the current one, E%41%*. We now go back
to E%40%*, chasing the control chain. Since both arguments to %3app%* are now
evaluated, we find the definition of %3app%* and set up a new environment E%42%*.
Thus:
.BEGIN TURN ON "\";NOFILL;TABS 5,13,28,33,36,51,56,59,64,67,70;GROUP;
\\%3f[y]\\\f[g[x]]%1
\\E%42%*\\\E%43%*
\ E%40%*\| E%40%*\\ E%42%*\| E%41%*
\ __________\=>\_________
\ %3y\| (A. (B . C))\\x\| (A. (B . C))
\ f\| λ[x]f[g[x]]%1:E%41%1
.END
The form to be evaluated in E%42%* is %3f[y]%*; we find %3y%* and %3f%*
both locally. Since %3f%* is a λ-definition, we set up a new environment
binding the λ-variable %3x%* to the value %3(A#.(B#.#C))%*. But the λ-definition
is also a funarg; therefore the access environment stored in E%43%* is E%41%*.
The control component of E%43%* is set to the prior environment, E%42%*; and
we begin evaluation of the body %3f[g[x]]%*.
Now in E%43%* we find %3x%* locally but have to resort to the access chain to
find %3f%* and %3g%*; using funargs, we have set up the appropriate
environments. From E%43%* we have access to E%41%*⊗↓and from there to E%40%*
if it were needed.←:
.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";
∞1E∞40∞G
/\
/ \
/ \
∞1E∞41∞G ∞1E∞42∞g
. \
. \
.∞1E∞43∞g
.END
The rest of the evaluation goes without a hitch: we finish the evaluation
in E%43%* and return to E%42%* and finally to E%40%* following the control
evironments.
The problems with functional arguments and functional values arise
from occurrences of
free variables. %3f%1 and %3g%* in %3compose%* are free variables
and therefore their bindings are not to be found in the local environment.
Since the interesting applications of such functions usually involve
free variables, we must deal with them. In particular, the %3label%* operator
will typically involve free variables. In:
←%3label[fact; λ[[x][eq[x;0] → 1; %et%* → *[x;fact[sub1[x]]]]]]%1
the occurrences of %3fact%* are free. Indeed, the behavior of %3label%* in
{yonss(P38)} must be modified slightly to form a funarg pair, binding
the name to the body of the definition. For example,
recall the use of Weizenbaum environments on {yon(P104)} to show
the evaluation of %3fact[3]%1. That must now be modified to attach
the appropriate environment to the λ-definition.
.BEGIN TURN ON "\";NOFILL;TABS 10,13,18,23,26,31,36,39,44,47,50,55,60,63;
.GROUP
\\%3fact[3]\\\fact[2]\\\fact[1]\\\fact[0]\\\x%1
\\E%40%*\\\E%41%*\\\E%42%*\\\E%43%*\\\E%44%* ...etc.
\ \| /\\E%40%*\| E%40%*\\E%41%*\| E%40%*\\E%42%*\| E%40%*\\E%43%*\| E%40%* ...
\______\=>...\______\=>...\______\=>...\______\=>...\______
\%3fact\| λ[[x] ...:%1E%40%3\ x\| 3\\ x\| 2\\ x\| 1\\ x\| 0 ...%1
.END
What does this say about functions? We have already remarked that functions are
parametric values; to that we must add that functions are also tied to the environment
in which they were created; they cannot be evaluated %3in vacuo%*.
.P89:
What does this say about "<="? It %2still%* appears to act like an assignment statement.
It is taking on more distinct character since it must associate environments with
the function body as it does the assignment.
The device LISP used to associate environments with functions is called
the ⊗→%3FUNARG%*↔← hack. (More is said about implementations of %3FUNARG%* in {yonss(P3)}.)
When %3eval%* sees the construction %3(FUNCTION %1fn%3)%* it returns as value
the list:
←%3(FUNARG###%*fn### current-symbol-table%3)%*.
When %3eval%*, or actually %3apply%* sees %3(FUNARG %*fn st%3)%*, that is,
when we are %2calling%3 fn%1, we use the symbol table %3st%*, rather than
the current symbol table for accessing global variables.
Thus there are %2two%* environments involved in the proper care and feeding of functional
arguments. First there is the environment which is saved with the %3FUNARG%*.
This is called the %2⊗→binding environment↔←%* since it is the
environment current at the time the functional argument was constructed
or bound. The second environment, called the %2⊗→activation environment↔←%*
is the environment which is current when the functional argument is %2applied%*
or activated. Thus the activation environment is used to locate local
variables, but if a non-local variable is needed then the binding
environment is selected.
It is the duty of %3eval%* and %3apply%* to use the %3FUNARG%* device
to maintain the proper control of the activation and binding environments.
.<<pic of funarg execution>>
.P79:
Let's follow the behavior of an %3eval%* and %3apply%* family which
has been modified to handle functional arguments.
Let %3fn <= λ[[x%41%*; ... ;x%4n%*] ...]%1 be a function which will be used
as a functional argument in the context, say:
.BEGIN SELECT 3; CENTERIT;GROUP;
←fie <= λ[[ ...;foo ...] .... foo[ ...] ]
←fie[ ...; function[fn]; ...]
.END
in the sequel %3st%* and %3st%4save%1 will name symbol tables. "→" will mean
"points to" (i.e.,#%3cons%*). p%4i%*'s will be dotted pairs in a symbol table.
Then let's see what calling %3fie[#...;#function[fn];#...]%* will do.
.BEGIN TABIT3(10,50,52);GROUP
\%3(FIE ... (FUNCTION FN) ...) st:%bαααα→ . . .%1\%bα→π→αααα→%1p%4n%* → p%4n-1%* → ...p%41%1⊗↓%3st%1
contains previous bindings and the function definitions.←
\%b ....
\%1computation\\%b↑%*
\%3(FUNCTION FN)\\%b~%*
\%1 gives:\\%b↑%*
\%3(FUNARG FN st%4save%1:%b→αααα→ ...\%bα→λ%*
\ ....\\%b↑%*
\%1computation %3st%1 : p%4m%* → p%4m-1⊗↓%1Note that one of the
p%4i%*-pairs will be %3(FOO . (FUNARG FN %1st%4save%3)).%1←%bαα→ . . .\%bα→λ%*
\%3(FOO %1a%41%* ... a%4n%*)\\%b↑%*
\%1 gives:\\%b~%*
\%3((FUNARG FN st%4save%*)%1a%41%* ... a%4n%3)\\%b~%*
\\\%b~%*
.BEGIN noFILL;
%1The a%4i%*'s are evaluated in the context %3st%* %2not%* %3st%4save%1\\%b↑%*
by %3evlis[#...;#st]%* giving v%4i%1's.\\%b~%*.
\\\%b~%*
We then apply %3fn%* to the v%4i%*'s in the context %3st%4save%1\\%b~%*
%2not%* in environment %3st%* by calling \\%b↑%*
%3apply[#...#;#...#;st%4save%*].\\%b~%*
\\\%b~%*
%1This results in:\\%b~%*
\\\%b↑%*
%3eval[body[fn]; %3((x%41%1 . v%41%3) → ... (x%4n%1 . v%4n%3)%1 → . . . \%b→α$%*###]
.END
.END
After we finish this inner call on %3apply[#...#;#...#;st%4save%*]%1 the table %3st%*
is restored. Notice that our symbol tables are now really tree-like rather than
stack-like.
It should be apparent from %3eval%* that %3(QUOTE#...)%*
and %3(FUNCTION#...)%* will have the
same effect if %3fn%* has no global (or free) variables.
This seems like a lot of work to allow a moderately obscure construct to appear in a language.
However constructs like functional arguments appear in several programming languages
under different guises. Usually the syntax of the language is sufficiently
obfuscatory that the true behavior and implications of
devices like functional arguments is misunderstood. Faulty implementations
usually result. In LISP the problem %2and the solution%* appear
with exceptional clarity⊗↓Indeed LISP was the first language to allow functional
values.←.
Functional arguments may be exploited to describe very general control structures.
We will say more about this application later.
Finally, here is a sketch of the abstract structure of the current %3eval%*.
.BEGIN SELECT 3;GROUP;TABIT1(12);
eval <= λ[[exp;environ]
\[isvar[exp] → value[exp;environ];
\ isconst[exp] → denote[exp];
\ iscond[exp] → evcond[exp;environ];
\ isfun[exp] → makefunarg[exp;environ];
\ isfunc+args[exp] → apply[func[exp];evlis[arglist[exp];environ];environ]]
.APART
.GROUP
%1where:%*
apply <= λ[[fn;args,environ]
\[isfunname[fn] → ....
\ islambda[fn] → eval[body[fn];newenviron[vars[fn];args;environ]]
\ isfunarg[fn] → apply[func%41%*[fn];args;evn[fn]]
\ .... ..... ]]
.APART
.END
Now for some specific examples.
In particular, most implementations of LISP include a very useful class
of ⊗→mapping functions↔←.
.BEGIN INDENT 0,10;
⊗→%3maplist%*↔← is a function of two arguments, a list, %3l%*, and a functional
argument, %3fn%*. %3maplist%* applies the function %3fn%* (of one argument)
to the list %3l%* and its tails (%3rest[l], rest[rest[l]],#..%1)
until %3l%* is reduced to %3(#)%*.
The value of %3maplist%* is the list of the values returned by %3fn%*.
Here's a definition of %3maplist%*:
.END
.GROUP;
←%3maplist <= λ[[fn;l][null[l] → ( ); %et%* → concat[fn[l];maplist[fn;rest[l]]]]]%1.
Thus:
%3←maplist[REVERSE;(A B C D)] = ((D C B A)(D C B)(D C)(D))%*.
.APART;
An interesting and non trivial use of functional arguments
is shown on {yon(P136)} where we define a new control structure
suitable for describing algorithms built to operate on lists.
.CENT(Problems)
I. What changes should be made to the LISP syntax equations to
allow functional arguments?
.P151:
II. Use %3app%* on {yon(P152)} to define a function which computes factorial without
using %3label%* or explicit calls on the evaluator.
III. Extend %3eval%* and friends to functional arguments.
.GROUP;
.P99:
IV. An interesting use of functional arguments involves ⊗→self-applicative functions↔←.
An application of a function %3f%* in a context %3f[...,f,...]%* is an instance
of self application ⊗↓provided the designated argument position is a functional
argument←. Self-applicative functions can be used to define recursive
functions in such a way that the definition is not %2statically%* self-referential,
but is %2dynamically%* re-entrant.
For example here is our canonical example, written using self-applicative functions:
.APART
.BEGIN CENTER;SELECT 3;GROUP;
fact <= λ[[n]f[function[f]; n]]
f <= λ[[g;n][n=0 → 1; %et%* → *[n; g[g; n-1]] ]]
.END
Use Weizenbaum's environments to show the execution of %3fact[2]%*.
V. To see why %3QUOTE%*-ing is not sufficient, evaluate %3foo%9'%*[( )]%1,
where:
.BEGIN CENTERIT;
←%3foo%9'%3 <= λ[[l]mapfirst[(LAMBDA (X) (CONCAT X L)); (A B C D)]%1.
.END
and compare the result to the computation of %3foo[( )]%1.
.END
.SS(Binding strategies,binding strategy,P134:)
After the discussion of free variables beginning in {yonss(P135)} and the
intervening discussions of environments, it should now be clear
that the root of the binding problem is free variables. We cannot completely
outlaw free variables. Any interesting recursive algorithm has free variables.
.BEGIN TURN ON "←";
For example, %3f%* is free in the right hand side of the following:
←%3f <= λ[[x][zerop[x] → 1; %et%* → *[x;f[x-1]]] ].%1
.END
We don't want to restrict their use too precipitously since they
are a very useful programming technique. For example the possible alternative
of passing all global information through as extra parameters in calling
sequences is overly expensive. Also functional arguments and
functional values are an extremely powerful construct which we'd rather
not relinquish.
Using environments we can now give a computational interpretation to dynamic
binding. We can now see that it is simply the technique of chasing the
access chain. The only perturbation of this scheme occurs when we have
functional arguments or functional values.
Handling of global variables varies from programming language to programming
language. The solution advocated by Algol-like languages is called %2⊗→static
binding↔←%* and dictates that all global references be fixed in the binding
environment. LISP at least gives you a choice; using %3QUOTE%* you will get
the dynamic binding on free variables in a functional argument; using
%3function%* gives the static interpretation.
There are no questions about Algol's interpretation of functional values;
this construction is not allowed. When we discuss implementation
we will see why.
.SS(special forms,special form,P8:)
We have remarked that the evaluation scheme for LISP functions is
call-by-value and, for functions with multiple arguments, left-to-right
evaluation of arguments. We have also seen, in %3QUOTE%* and %3COND%*,
that not all forms to be evaluated in LISP fall within this category.
We have already noted on {yon(P44)} that %3QUOTE%* and %3COND%* are
%2not%* translates of functions. Clearly we don't evaluate the arguments to
%3(QUOTE ...)%*; indeed the purpose of %3QUOTE%* was to %2stop%* evaluation.
Also the `arguments' to %3COND%* are handled differently; their evaluation was
handled by %3evcond%*.
We therefore have called %3QUOTE%* and %3COND%* %2special forms%*.
We would like to discuss special forms as a generally useful technique.
Consider the predicates, ⊗→%3and%*↔← and ⊗→%3or%*↔←. We might wish to define
%3and%* to be a binary predicate such that %3and%* is true just in case
%2both%* arguments evaluate to %et%*; and define %3or%* to be binary
and false just in case both arguments evaluate to %ef%*.
Notice two points. First, there is really no reason to restrict these predicates
to be %2binary%*. Replacing the words "binary" by "n-ary" and "both" by "all"
in the above description has the desired effect.
Second, if we evaluate the arguments to these predicates in some order,
say left-to-right, then we could immediately determine that %3and%*
is false as soon as we come across an argument which
evaluates to %ef%*; similarly a call on %3or%* for an arbitrary number
of arguments can be terminated as soon as we evaluate an argument giving
value %et%*.
But if we insist that %3and%* and %3or%* are %2functions%* we can
take advantage of neither of these observations. Functions have a fixed
number of arguments, all of which are evaluated. The solution should be
clear: define %3and%* and %3or%* as special forms and handle
the evaluation ourselves. Presently, the
only way to handle special forms is to make explicit modifications
to %3eval%*. This is not terribly difficult (or desirable). When
we see a more realistic version of %3eval%* and Co. in {yonss(P30)}, we will have
a simple way to add such forms. See also {yon(P54)}.
.BEGIN CENTERIT;SELECT 3;GROUP;TABIT1(17);
%1Recognizers for the predicates must be added to %3eval%*:
.SELECT 3;
←isand[e] → evand[args[e];environ];
←isor[e] → evor[args[e];environ];
←.....%1 where:%3
.P53:
evand <= λ[[l;a]\[null[l]→ %et%*;
\ eval[first[l];a] → evand[rest[l];a];
\ %et%* → %ef%*] ]
evor <= λ[[l;a]\[null[l] → %ef%*;
\ eval[first[l];a] → %et%*;
\ %et%* → evor[rest[l];a]] ]
.END
Notice⊗↓
Again notice that the abstract versions of
%3evand%* and %3evor%* know that the arguments are presented as a sequence.
The structure of the recursion implies a left-to-right evaluation←
the explicit calls on %3eval%*. This is expensive, but cannot be helped.
Later we will show a less costly way to handle those "non-functions" which
only have an indefinite number of arguments, all of which are
to be evaluated (see {yonss(P18)} on macros).
.CENT(Problems)
I What is the difference between a special form and call-by-name? Can call-by-name
be done in LISP (without redefining %3eval%*)?
.P71:
II %3select%* is a special form to be called as:
%3select[q;q%41%*;e%41%*;#...#;q%4n%*;e%4n%*;e]%1 and to be evaluated as follows:
%3q%* is evaluated; the %3q%4i%1's are evaluated from left to right until
one is found with the value of %3q%*. The value of %3select%* is the value
of the corresponding %3e%4i%1. If no such %3q%4i%1 is found the value of
%3select%* is the value of %3e%1. %3select%* is a precursor of the
%2⊗→case statement↔←%*. See {yon(P70)}.
Add a recognizer to %3eval%* to handle %3select%* and write a function to
perform the evaluation of %3select%*.
.SS(The %3prog%*-feature,,P37:)
.TURN ON "←";
%1
Though recursion is a significant tool for constructing LISP programs,
there is another technique for defining algorithms in LISP. It is
an iterative style of programming which is called the
⊗→%3prog%*↔←-feature in LISP.
First a general disclaimer: Some pure ("For my next trick I'll walk
on the water") LISP programmers feel that there is something slightly
obscene about writing iterative style programs. This isn't quite
true, but there are some good reasons for emphasizing recursion.
%21.%* Anyone can write an iterative scheme. Recursion is a powerful
tool and very possibly a new programming technique
for you. You should exercise your skill and resort to the %3prog%*
as a last resort.
%22.%* Two of the usual trappings of iteration are %2labels%* and %2go-to%*'s.
These are truly tools of the devil. In recent years several studies
by reasonable men have shown that algorithms which resort to labels
and gotos tend to be harder to read, harder to modify, sloppy in their
analysis of the process being coded, and generally ugly. The label
and goto hack invites patching over poor analysis instead of
reanalyzing the problem.
%23.%* With the control of the loop structure (either recursion or some
kind of controlled iteration, e.g., a FOR or WHILE statement) in the
hands of the language rather than in the hands of the programmer, the
static text and the dynamic flow of the execution have a parallel
relationship. This has had a beneficial effect for studies
investigating the provability of properties of programs.
Now that this has been said, here's our discussion of %3prog%*s.
Many algorithms present themselves more naturally as iterative
schemes. Recall the recursive computation of the length of a list given
on {yon(P19)}. %3length%* seems inordinately complex; our sympathies
lie more with %3length%41%1. Even this is not as straightforward as you would
expect for such a simple calculation. Rather, consider the following:
%21.%* Set a pointer to the given list. Set a counter to zero.
%22.%* If the list is empty, return as value of the computation, the
current value in the counter.
%23.%* Otherwise, increment the counter by one.
%24.%* Set the pointer to the %3rest%* of the list.
%25.%* Go to line 2.
The new ingredients here are:
%21.%* An ⊗→assignment↔← statement: "set a ...".
%22.%* Access to some new cells: the pointer, the counter.
%23.%* Sequencing (albeit usually implied) between statements: line %21%*, then line %22%*...
%24.%* Gotos and labels.
%25.%* An explicit exit from the procedure: line %22%*.
Here is the LISP %3prog%* version of the length function:
.BEGIN TABIT2(15,17);SELECT 3;GROUP;TURN OFF "←";
.P45:
⊗→length↔← <= λ[[x]prog[[l;c]
\\l ← x;
\\c ← 0;
\a\[null[l] → return[c]];
\\l ← rest[l];
\\c ← c+1;
\\go[a]] ]
.END
A few points should be made: The ⊗→%3prog%*-variables↔←, %3l%* and %3c%1, are ⊗→local variable↔←s.
That is, they only have meaning within this definition of %3length%*. If they
appeared in some program which called %3length%*, then the right thing
would happen; the old values would be saved (like λ-bindings) and then
restored after the call on %3length%* has been completed. Among other things,
this allows %3prog%*s to be used recursively.
Though assignment statements are normally executed for their effect on the
environment -- they have side-effects, assignments in LISP also have a value.
The value of an assignment is the value of the right-hand-side.
Conditional expressions have a slightly different effect inside %3prog%*s. If
none of the predicates in the conditional are true, then the statement following
the conditional is executed.
.P83:
Labels are clear; they are identifiers. Labels are local, that is must be found
within the body of the current %3prog%*. Labels may conflict with the
λ-variables or %3prog%*-variables; the evaluator for %3prog%*s can resolve the
conflicts by context.
⊗→%3go%*↔← is a little more complicated. %3go%* is a special form. If the argument
to %3go%* is %2atomic%* then it is interpreted as a label; otherwise, the argument
is %2evaluated%* and the result of the evaluation is interpreted as a label.
This results in a useful programming trick. Let %3l%* be a list of dotted pairs,
each of the form, %3(%* object%4i%* . label%4i%3)%1. At each label%4i%* we
begin a piece of program to be executed when object%4i%* has been recognized.
Then the construct:
.P25:
%aUGH%*←%3go[cdr[assoc[x;l]]]%*
can be used to "dispatch" to the appropriate code when %3x%* is one of the
object%4i%*. This is an instance of %2⊗→table-driven↔←%* programming.
This programming trick shows "go-to" programming in its most pervasive form.
The blocks of code dispatched to can be distributed throughout the body of the
%3prog%*. Each block of code will usually be followed by a %3go%* back to
the code involving equation %aUGH%* (above). In fact the argument %3l%* in %aUGH%*
may be %2global%* to the %3prog%*-body.
The general effect is to make a %3prog%* which is very difficult to understand.
The LISP %3select%* ({yon(P71)}) will handle many of the possible applications
of this coding trick and result in a more readable program. The case-statement ({yon(P70)})
present in some other languages is also a better means of handling this problem.
The ⊗→%3return%*↔← statement evaluates its argument, locates
the %3prog%* surrounding the %3return%*, and returns the value of its argument
to the caller of the %3prog%*.
When a %3prog%* is entered the %3prog%*- (local) variables are initialized to %3NIL%*.
The body of the %3prog%* is a sequence of statements and labels. Statements are
executed sequentially unless a %3go%* or %3return%* is evaluated.
If a %3prog%* runs out of statements then %3NIL%* is returned.
Returning from a %3prog%* unbinds the local variables.
Now to the problem of translating %3prog%*s into a S-expression representation.
We need be a little careful. First notice that %3prog%* is a %2⊗→special form↔←%*
(like %3COND%* and %3QUOTE%*). The construct:
←%3prog[[l;c]....]%* will be translated to:
←%3(PROG(L C) ...)%*.
But %3PROG%* is not the translate of a function
any more than %3(QUOTE ...)%* or %3(COND ...)%* is.
So the body of the %3prog%* must be handled specially by a
new piece of the evaluator.
.TURN OFF "←";
Similarly we must be careful about the interpretation of ←. First,
we will write %3x ← y%* in prefix form: %3←[x;y]%*. Now, using our usual LISP
technique we might map this to:
.BEGIN CENTER
%3(SET X Y).%* Is ← or ⊗→%3SET%*↔← a function in the usual LISP sense?
.END
Not likely; for if %3x%* and %3y%* have values %32%* and %33%*, for example, then the
call-by-value interpretation of %3←[x;y]%* would say %3←[2;3]%*. This was not our
intention, hopefully. What do we want? We want to evaluate the second argument
to ← while stopping the evaluation of the first argument. But LISP does
have a device for stopping evaluation: ⊗→%3QUOTE%*↔←. So we can define %3SET%* as a normal
LISP function, provided we
.BEGIN CENTER
translate %3x ← y%* as %3(SET (QUOTE X) Y)%*.
.END
.TURN ON "←";
For example, look at the evaluation of %3(SET (QUOTE X)(PLUS X 1)).%*
Assume the current value of %3X%* is %32.
SET%* is a function; evaluate the arguments from left to right. The
value of %3(QUOTE X)%* is %3X%*; the value of %3(PLUS X 1)%* is %33%*; assign %33%* to %3X%*.
Question: what does %3(SET Z (PLUS X 1))%* mean? Well if the current value of
variable %3z%* (represented by %3Z%*) is an identifier (non-numeric atom), then
%3(SET Z (PLUS X 1))%* makes sense. Assume the current value of %3Z%* is %3A%*, then
the effect of the %3SET%* statement is to assign the value %33%* to %3A%*.
Normally when you are making assignments, you want to assign to a %2name%* and
not a %2value%*; thus you will be using the form
←%3(SET (QUOTE ...) ...).%*
As a convenience an abbreviation, ⊗→%3SETQ%*↔←, has been made available:
←%3(SETQ ... ... )%* means %3 (SET (QUOTE ...) ...).%*
Again note that %3SETQ%* is not (the translate of) a function. It may be
defined as a special form or consider as a notational convenience like %3list%*.
Here is a translation of the body of the %3prog%* version of %3length:%*
.GROUP
%3
.BEGIN TABIT3(10,17,20);
\(LAMBDA(X)
\\(PROG(L C)
\\\(SETQ L X)
\\\(SETQ C 0)
\\A\(COND ((NULL L)(RETURN C)))
\\\(SETQ L(REST L))
\\\(SETQ C (ADD1 C))
\\\(GO A) ))
.END
.APART
%1
Now that assignment statements are out in the open let's reexamine "<=".
We already know ({yon(P89)}) that "<=" does more than simply associate the
right hand side with a symbol table entry of the left hand side; it must also
associate an environment with the function body, and this environment is to be
used for accessing global variables. This operation of associating environments
is called forming the %2⊗→closure↔←%*. We thus might be tempted to say:
.BEGIN CENTER;SELECT 3;TURN OFF "←";
f <= λ[[ ... ] ...] %1is%* f ← closure[λ[[ ...] ...] ].
.END
where %3closure[g]%* is to give as value a representation of the function %3g%*
associated with the %2name%* of the current environment
⊗↓We may %2not%* associate the current %2value%* of the
environment (i.e. symbol table). Why?←.
Alas, this implementation is still not sufficient as we will see in {yonss(P90)}.
.REQUIRE "PROB7.PUB" SOURCE_FILE;
.CENT(Another form of iteration)
.P136:
Some of the painful behavior of unbridled %3prog%*s can be eliminated
by the use of a new control structure for list operations. The
construct is called %3lit%* and is defined as:
.BEGIN CENTERIT SELECT 3;
←lit <= λ[[x;y;f][null[x] → y; %et%* → f[first[x];lit[rest[x];y;f]]]]
.END
Then for example %3append%* could be expressed as:
.BEGIN CENTERIT; SELECT 3;
←append <= λ[[x;y]lit[x;y;concat]]
.END
***more on lit and give a while construct***
.CENT(Alternatives to %3prog%*)
.P133:
all about lit
.SS(Rapprochement: In retrospect,,P90:)
We will begin with a sketch of the LISP evaluator as it would now appear:
basic %3eval%* plus the additional artifacts for %3label, function%*, and %3prog%*.
This evaluation process is very important and, though it is perhaps new material,
should appear quite intuitive.
%3eval%* and friends are not to be memorized. If
you cannot write functions equivalent to %3eval%*, then you have not understood
LISP evaluation.
Finally, we will examine some of the weaknesses present in LISP.
There are alternatives to some of the LISP techniques and there are some things
which, in retrospect, LISP could have done better.
First, the basic evaluator.
There are two arguments to %3eval%*: a %2form%* ⊗↓throughout this section we
will say "form", "variable", "λ-expression", etc. rather than "an S-expression
representation of a" ... "form", "variable", "λ-expression", etc. No
confusion should result, but remember that we %2are%* speaking imprecisely.←, that
is an expression which can be evaluated; and second, an association list or
%2symbol table%*. If the form is a number, the atom %3T%* or %3NIL%*, return that
form. If the form is a variable, find the value of the variable in the current
table or environment. If the form is a (non numeric) S-expression constant, return
that constant. If the form is a conditional expression, then evaluate it
according to the semantics of conditional expressions. If the form is a
%3prog%*, evaluate the body of the %3prog%* after binding the local %3prog%* variables
to %3NIL%*⊗↓Perhaps you can now see why quoted expressions, conditional expressions,
and %3prog%*s are called %2special forms%*.←.
The form might also be a functional argument. In this case evaluation consists of
forming the ⊗→closure↔← of the function, i.e., associating the current environment
with the function. In LISP this is done with the %3FUNARG%* device.
Any other form seen by %3eval%* is assumed to be a function followed by arguments.
The arguments are evaluated from left-to-right and the function is then applied
to these arguments.
The part of the evaluator which handles function application is called %3apply%*.
%3apply%* takes three arguments: a function, a list of evaluated arguments, and
the current symbol table. If the function is one of the five LISP primitives
then the appropriate action is carried out. If the function is a λ-expression
then bind the formal parameters (the λ-variables) to the evaluated arguments and
evaluate the body of the function. The function might also be the result of a
functional argument binding; in this case apply the function to the arguments
in the saved environment rather than in the current environment.
We might also be applying the label operator. All we need do here is apply the
function definition to the arguments in the current context augmented by the
binding of the function name to its definition.
We have saved what seems to be the simplest for last. That is, the function
has a name.
What we do is look up that name in the current environment.
To look something up means to find its value.
Currently we expect that value to be a λ-expression, which is then applied.
However, since function names are just variables, there is no reason that the
value of a function name could not be a simple value, say an atom, and
%2that%* value
applied to the arguments, etc. Indeed examination of %3apply%* on {yon(P69)}
will show that %3apply[X;#((A B))#;#((X#.#CAR)#...#)]%* %2will%* be handled correctly.
The obvious extension of this idea is to allow a %2⊗→computed function↔←%*. That
is, if the function passed to %3apply%* is not recognized as one of the preceding
cases, then evaluate the expresssion until it is recognized. Thus
we will allow such forms as:
.BEGIN CENTERIT;SELECT 3;
←((CAR#(QUOTE#(CAR#(A#.#B))))#(QUOTE#(A#.#B)))
.END
What conceptual difficulties are present in LISP evaluation?
One of the more important defects in LISP is its inadequate handling of function
valued variables, functional arguments being a particular case which LISP does
attend to correctly.
LISP was the first language to handle functional arguments so it is not too suprising
that all is not quite right.
The %3FUNARG%* device is sufficient for simple uses of functional arguments and
closures. However, we should like to return functions and closures as %2values%*.
Returning ⊗→open function↔←s is easy; we can simple %3cons%*-up λ-expressions.
Returning closures is more difficult. So perhaps we should finally lay "<=" to
rest. As you might have expected, difficulties occur when we examine recursive
definitions.
.P94:
Consider the canonical example:
.BEGIN CENTERIT;SELECT 3;
←fact <= λ[[x][x=0 → 1; %et%* → *[x;fact[x-1]]]]
.END
First, we attempted to implement this as:
.BEGIN CENTER;SELECT 3;TURN OFF "←";
fact ← closure[λ[[x] ... fact[x-1] ..]
.END
we must use the %2name%* of the environment current at closure-time rather
than the value, since at the time we form the closure the name %3fact%* is
a free variable. Any value associated with %3fact%* at this time would be the
wrong one ⊗↓%2One%1 value would suffice; what is it?←. For example:
.BEGIN TABIT3(30,34,38);GROUP;TURN OFF "←";
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3foo%*
.APART
.GROUP
Then executing %3fact ← closure[ ... ]%* should give:
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3λ[[x] ...fact[x-1]] : %1E%4i%*
.APART
.GROUP
rather than:
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3λ[[x] ...fact[x-1]] : fact|foo%*.
.END
You should reflect on the current development before reading further.
There are problems however if we just associate the name of the environment
in the closure operation.
For example, consider the following sequence:
.BEGIN TABIT3(30,34,38);GROUP;
An initial environment:
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3λ[[x] ...fact[x-1]] : %1E%4i%*
.apart;
.GROUP;
Now execute %3foo <= fact%*, giving:
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3λ[[x] ...fact[x-1]] : %1E%4i%*
\%3foo%*\|%3λ[[x] ...fact[x-1]] : %1E%4i%*
.END
.BEGIN TURN ON "#";
This state should start worrying you; the "intent" of %3foo#<=#fact%*
was to make %3foo%* synomymous with %3fact%*. That clearly is not the case
though the right thing happens if we were now to evaluate an expression
involving %3foo%*. The problem is that it happens for the wrong reason: the occurrence
of %3fact%* in the body of %3foo%* will find the right definition of %3fact%*.
One more step will lead to disaster: %3fact#<=#loser%*.
.END
Now we have really lost. Though it is perfectly reasonable to redefine %3fact%*
--it is only a name-- our intent was clearly to keep %3foo%* as a realization
of the factorial function. This intent has not been maintained ⊗↓%1This example
was shown to the author by D. B. Anderson.←. So at least in the presence of
recursive definitions we really must be careful.
The %2real%* intent of the recursive definition of %3fact%* was to define a
function to effect the computation of factorial and then to %2name%* that
function %3fact%*.
Or, put another way, the effect of
.BEGIN CENTER;
%3fact%* <= %3λ[[x] ...fact[x-1]]%1 is quite different from:
%3foo%* <= %3λ[[x] ...fact[x-1]].
.END
Clearly we failed to handle this general problem, though
LISP does handle its %3label%* subset correctly.
Let's see what's really going on.
Perhaps an easier avenue lies in looking first at assignments to %2simple%*
variables rather than functional variables. It is clear how the environment
should change during the execution of a sequence say:
.BEGIN TABIT1(30);TURN OFF "←";SELECT 3;GROUP;
\x ← 3
\y ← x
\x ← 4
.END
Let's try something slightly more complicated:
.BEGIN TABIT1(30);TURN OFF "←";SELECT 3;GROUP;
\x ← 1
\x ← x%82%* - 6
.END
Here we simply assign %31%* to %3x%*; then while we are evaluating the right
hand side of the next statement, we look up the current value of %3x%*,
perform the appropriate computations, and finally assign the value %3-5%*
to %3x%*. Compare %2this%* to the %3fact%* definition; there we made a point
of not "evaluating" %3fact%* on the right hand side of "<=". What %2is%* happening?
.P109:
.BEGIN TURN ON "#";TURN OFF "←";
Notice that %2after%* an assignment like %3y#←#x%* has been executed, then
equality holds between the left-hand and right-hand quantities ⊗↓Indeed, a
logic of programs, due to C.A.R. Hoare, exploits this fact in interpreting
assignment.←.
Let's look more closely at the relationship between "←" and "=".
Consider %3x#←#x%82%*#-#6%1. After the assignment is made, equality
does %2not%* hold between left- and right-hand sides. We must be
careful.
Consider %3x#=#x%82%*#-#6%1.
Interpreting this expression as an equation ⊗↓%1Not as an expression
whose value is true or false depending on the current value of %3x%*← we find
two solutions: let %3x%* have value %3-2%* or %33%*.
Now if we examine our "definition" of %3fact%* in %2this%* light, interpreting
"<=" as being related to "=" rather than "←", then we are faced with an equation
to be solved. Only now the form of the solution is a %2function%* which satisfies
the equation. There are frequently many solutions; there may be no solutions.
In the our case there is at least one solution: the usual factorial function.
So what we %2should%* write is something more like:
.BEGIN CENTER;SELECT 3;
fact ← %1a solution to the equation%*: f = λ[[x][x=0 → 1; %et%* → *[x;f[x-1]]]];
.END
.END
Questions of when solutions exist, how many, and which are the "best"
solutions will not concern us here.
We will return to these points in detail in {yonss(P85)}.
This discussion should make clear the necessity of distinguishing between
assignment and equality; unfortunately, many programming languages use notations
which tend to obscure these differences.
.CENT(Problems);
I Write complete versions of %3eval%* and %3apply%*.
II The example of this section which points out the difficulties of closures
is not described in LISP. Can you write a pure LISP (i.e., without %3prog%*s)
program which will also exemplify this difficulty?
.CENT(More postmortem)
It's also time to tell the truth about data structures.
We began our study of LISP with an
attempt to be formal, precise and abstract. However we frequently
had to mention that the programming aspects of LISP differed
with this or that feature we were explaining. When we began
our discussion of the mapping of LISP expressions to Symbolic
Expressions then we had to confess even more; we don't express
our algorithms as LISP expressions, but write in the S-expression
representation. One effect of this is to be faced with reasonably
ugly programs; some of the more positive attributes are discussed in
{yonsec(P175)}.
.P132:
Experience has shown another defect in the design of LISP as a viable
programming language. This is LISP's lack of data-structure definition
facility. The only data structure which LISP admits to is the
class of Symbolic Expressions.
In a half-hearted way, LISP lists are data structures. List-notation
is a recognizable input and the output from LISP programs will be converted
to list-notation wherever possible.
There are the requisite primitives to manipulate lists. But that's where
things stand. LISP will allow you to manipulate the %2representation%*
of the lists; that is bad. The LISP S-expr operations like %3car, cdr%*, and
%3cons%* operate without complaint on lists, even though we have
repeatedly said that these functions are S-expression functions.
The effect of this rather cavalier behavior of LISP is to present the potential
LISP user with an incredible potentiality for generating programming errors.
It also removes from the user a powerful debugging tool. If we
write programs such that the type of each data object must be given⊗↓For example,
in {yonss(P41)} the types on the arguments to %3diff%* should be <poly> and
<variable> %2not%* list and atom.←,
and if we write each function such that the process of binding arguments
to values must check that the type of the actual parameter agrees with
the type of the parameter of the function, then a very large number
of programming errors can be located almost as soon as they occur.
You can think of the parameter-passing mechanism as sort of a "fire-wall",
which will attempt to contain the deviant behavior within the particular
function. Any function which gets called has a right to expect
that it will be called with reasonable values. Part of being
reasonable is having the correct number of arguments given to it;
%3cons[A;#B;#C]%* is as bad as %3cons[A]%*. Part of being reasonable
is having the right kind of arguments; we don't expect reasonable results
from %3sub1[A]%*. All functions should expect to be treated courteously;
we should not expect that the functions are sufficiently omniscient
to convert an argument of the wrong kind into a proper one.
This violates the whole point of supplying abstract data structures.
If a function is written to expect an argument of type %3polynomial%*
then it should complain bitterly if it receives an argument of type
%3list%* even though the current representation for polynomials
might be special instances of lists.
But many current languages do indeed offer such omniscience. Fortran
calls this service "conversion"; Algol68 calls it %2"⊗→coercion↔←"%*.
It is a mistake.
It's a mistake for several reasons. First, coercion tends to result in
very obscure bugs. If each function blithely accepts whatever argument
it is given and attempts to use it in its computation, then the first
inkling of trouble will occur when a primitive function is called and
causes some complaint. Typically this indication of error is way past
the actual source of the difficulty. The only alternative is to
explictly code tests into the entry code of each function definition
which will test the acceptability of each argument. This will suffice
but is of unnecessary expense and complexity. The computation
needed to validate the argument is much less complex than that
needed for a general computation. That test can be carried out in the
subconscious of the parameter passing mechanism with minimal cost.
LISP also falls into this dismal category. It has no capability to
define and maintain abstract data structures. We can certainly
add a prolog to each LISP function to check for consistency; but
that's an expensive use of the programmer's time and the computation time.
So what typically happens is that the tests are left out⊗↓"There are
no bugs in %2my%* program!"← and when the program doesn't run correctly
the errors which the user sees are messages generated by detectable
aberations deep in the subconscious of LISP. There is usually
a significant bit of detective work necessary to uncover the real
culprit.
Another embarassment it LISP's handling of the truth-values %et%* and
%ef%*. LISP maps these values onto %3T%* and %3NIL%*.
Truth values are not S-exprs and should not be
treated as such.
As we have seen,
this is an unnecessary use of representation-dependent programming.
It causes grief if when we try to be precise about discussing other
kinds of data-structures. The truth values %2do%* get representations
as S-exprs when we represent LISP programs as data structures; but
that's another story completely.
As we have just seen,
symbolic expressions are the only real data structure. We almost
have sequences as a data structure, and the necessary ingredients
are there to build abstract data structures. But the questions of
integrity in using such defined data structures is left totally in the
hands of
the programmer. This typically a poor repository for such a fragile
commodity; expediency and
efficiency usually take precedence over careful design.
.SEC(Running on the machine,,P175:)
****this section needs extensive revision***
%1
This section is for the brave few who wish to run on a machine.
The %2programming language%*, LISP, is the Sexpr translation
of the LISP functions and %3prog%*s that we have been writing. What are
some of the implications of writing in Sexpr form?
First, LISP becomes the world's largest parenthesis sink. It makes
for difficult reading at times, but there are formatting tricks, and
editing programs which will help the user match parens and locate
parts of the program. (It only hurts for a little while). There is
a practical benefit which greatly outweighs this anguish factor:
since proper input to LISP programs are Sexprs and since we are writing
LISP programs in Sexpr form then on input, data and program are
indistinguishable in format; i.e., the are both binary trees.
Obviously for evaluation, programs must have a very special structure,
but program and data are both trees just as in more hardware machines
the contents of locations which contain data are indistinguishable
in format from locations which contain instructions.
On a typical contemporary machine it is the way in which a location is accessed
which determines how the contents of that location are interpreted.
If the central processor accesses the location via the program
counter, the contents are assumed to be an instruction. That same
location can usually be accessed as part of a data manipulating
instruction. In that case, the contents are simply taken to be data.
LISP is one of the few high-level languages which also has this property.
It gives the programmer exceptional power. Let's complete the
analogy. The central processor of a LISP machine is, ⊗→%3eval%*↔←.
If %3eval%* references a binary tree via its `program counter', then
that tree is decoded via the internals of %3eval%*. If a tree is
referenced as input to an Sexpr-massaging function, then it is
taken as data. As a result, a LISP program can %3cons%*-up a new
Sexpr which can then be evaluated (i.e., interpreted as an intruction).
The simplest way to communicate with such a machine is to read an
sexpr translate of a LISP expression into memory, evaluate the expression, and
print the result. Several implementations of LISP use a slight variant
of this called the "read-eval-print" loop:
←%3prog[[]##a##print[eval[read[];NIL]];go[a]]%*.
.BEGIN TURN ON "#";
Since programming is done using the sexpr translation of LISP functions
it becomes convenient to sometimes say "...#the function %3CAR%*#..." or
"...#write a %3PROG%*#...",#... when we actually mean %3car%* and %3prog%*,#....
The actual LISP functions are written in the language generated by syntax equations
which we have been including. This language is called the meta-language, and
the LISP expressions called ⊗→M-expressions↔← or M-exprs.
The distinction between M-exprs and their S-expr translations must not be forgotten.
.END
Though %3eval%* is the equivalent of the basic Cpu of the ⊗→LISP machine↔←,
there is another artifact in many versions of LISP to communicate
with the outside world. As with most pieces of I/O equipment,
it leaves something to be desired. Its name is %3evalquote%*.
.SS(%3evalquote%*,%3evalquote%*)
The basic structure of the %3evalquote%*-loop is:
%21.%* read in a (Sexpr representation of) function . I.e., a name
or a lambda expression.
%22.%* read in a list of (evaluated) arguments to be applied to the function.
%23.%* apply the function (of step %21.%*) to the argument list.
%24.%* print the value.
%25.%* go to step %21.%*
Thus the structure of the loop is essentially:
.BEGIN TABIT2(24,27);TURN OFF "←";
%3
.GROUP
prog[[fn;x;z]
\l\fn ← read[ ];
\\x ← read[ ];
\\z ← evalquote[fn;x];
\\print[z];
\\go[l]]]
%1where: %3evalquote <= λ[[fn;x]apply[fn;x;NIL]]
.APART
%*
or more concisely:
%3
.GROUP
\prog[[ ]
\ l print[evalquote[read[ ];read[ ]]];
\ go[l]]
.APART
%*
.END
.GROUP
where:
%21.%* the function, ⊗→%3read%*↔←, negotiates with an input device to read an
Sexpression.
%22.%* the function, ⊗→%3print%*↔←, prints the value of its argument on an output
device.
.APART
The details of %3read%* and %3print%* will appear when we discuss implementation
of LISP.
.BEGIN TURN ON "#";
Here's an example of the behavior of %3evalquote%*. Assume that the input
device contains %2CAR#((A#B))%*. Then the first %3read%* in %3evalquote%* gets
%2CAR%* (a string representing an Sexpr), and the second read gets %2((A#B))%*;
then %3apply%* gets called as:
%3
←apply[CAR;((A B));NIL].
%*
%3apply%* says that this is the same as evaluating
%3car[(A#B)]%*, and thus returns %3A%* as its answer, which is dutifully
printed by %3print%*.
.END
So %3evalquote%* can be used as a `desk calculator' for LISP.
If we wish to evaluate an expression %3f[a%41%*;#...#a%4n%*]%1 for
%3a%4i%1 constants, i.e. sexprs, then %3evalquote[F;#(a%41%*#...#a%4n%*)]%1
will do it for us. But the %3evalquote%*-loop will %2not%* evaluate arguments;
the a%4i%1's in the call on %3evalquote%* are not expressions, not translates
of constants, but are constants.
Typing %2CONS#((CAR#(QUOTE#(A)))#(QUOTE#(B)))%* to the %3evalquote%*-loop
will result in %3((CAR#(QUOTE#(A)))#.#(QUOTE#(B)))%*.
The %3evalquote%*-loop is an anomaly. It does not expect the usual representation
of a LISP form say %3(F e%41%* ... e%4n%*)%1 where the %3e%4i%1's
are to be evaluated. It expects a %2pair%* of sexprs representing a function
and %2evaluated%* arguments. The benefits of having "functional-notation" as
input and having arguments implicitly evaluated are greatly outweighed
by the confusion introduced by having two formats for LISP expressions, one for
the "top-level" and a different one everywhere else.
Certainly
before we can do any reasonable amount of calculation, we must be
able to define and name our own functions. The %3label%* operator, or
calling %3eval%* with an intial symbol table containing
our definitions will suffice, but this is not particularly elegant.
We would like our definitions to have some permanence.
A function (actually a %2special form%*, if you have been paying attention)
named ⊗→%3define%*↔←, whose effect is to add definitions to %3apply%*, has been
provided. The actual behavior of %3define%* will appear in the sections on
implementation.
.P51:
%3define%* is the name of a special form (its arguments are not evaluated)
of one argument, say %3x. x %*is a list of pairs: %3((u%41%* v%41%*) ... (u%4n%* v%4n%*)).%1
Each %3u%4i%1 is the name of a function and each %3v%4i%1 is the λ-expression
definition. Actually each %3u%4i%1 and %3v%4i%1 is the %2Sexpr translation%* of the
name and the function, but you knew that anyway. The effect of %3define%*
is to put the appropriate definitions in the symbol table.
For example we might wish to define the function which reverses a list as:
%3
.BEGIN NOFILL;TURN OFF "←";
.GROUP
reverse <= λ[[x] prog[[y;z]
y ← x;
z ← NIL;
a [null[y] → return [z]];
z ← cons[car[y];z];
y ← cdr[y];
go[a];]]
%1
.APART
Then the following would make this definition grist for the %3evalquote%* mill.
.GROUP
%3
DEFINE((
(REVERSE (LAMBDA (X)
(PROG (Y Z)
(SETQ Y X)
(SETQ Z NIL)
A(COND ((NULL Y)(RETURN Z)))
(SETQ Z (CONS (CAR Y) Z))
(SETQ Y (CDR Y))
(GO A) )))
))
%1
.APART
and subsequently:
%3REVERSE ((A B C)) %1would evaluate to: %3(C B A).%1
.END
%1
C. Weissman's LISP PRIMER is an excellent source of machine examples
and should be consulted now. Always remember that you are writing the sexpr representation
of LISP functions and expressions. Do not confuse the two.
.SS(A project: extensions to %3eval%*,%3eval%*,P67:)
.SELECT 1;
.BEGIN TURN ON "#";
Consider a p%4i%* → e%4i%* component of a conditional expression. As
conditionals are now defined, e%4i%* must be a single expression to be evaluated.
In subsets of LISP without side-effects this is sufficient. It is, however,
convenient in practice, i.e., in LISP %2with%* side effects, to extend conditionals to include components of the
form: p%4i%*#→#e%4i1%*; ... e%4in%*.
This extended component is to be evaluated as follows: if p%4i%* is true, then
evaluate the e%4ij%*'s from left to right, with the value of the component to be
the value of e%4in%*.
For example, this feature, used in %3prog%*s would allow us to replace:
.BEGIN TABIT2(10,14);
\\ ....
\\[p%41%* → %3go[l]%1]
\\ ...
\%3l\%1e%41%*;
\\e%42%*;
\\ ...
\\%3go[m];
.END
with: [p%41%* → e%41%*;e%42%*; ... %3go[m]%*;]. This is certainly easier to read.
This extended conditional expression is available on versions of LISP 1.6 on the PDP-10.
Here is another cosmetic. Instead of requiring that the body of a λ-definition
be a single expression: %3λ[[#...#]f[#...#]]%*, allow bodies of the form:
%3λ[[#...#]f%41%*[#...#];#...#f%4n%*[#...#]]%1. The evaluation of such
should mean; bind as usual, evaluate the %3f%4i%1's from left to right, returning
as value %3f%4n%*[#...#]%1.
This next extension to %3eval%* was derived from the syntax of ⊗→MUDDLE↔←, ⊗→CONNIVER↔←, and
⊗→MICRO-PLANNER↔←.
We have seen that LISP calling sequences are of two varieties: functions calls,
evaluating %2all%* of the arguments; and special forms, evaluating %2none%* of the
arguments.
In an attempt to generalize this regime we might allow the evaluation of some
of the arguments and enlarge on the domain
of objects which can appear in the list of λ-variables.
We might partition the formal parameters into obligatory parameters, optional
parameters, and an excess collector.
Obligatory parameters %2must%* have corresponding actual parameters; optional
actual parameters are used if present, otherwise declared default
values are used; and
if there are more actual parameters than the formals encompassed by the first
two classes, then they are associated with the excess collector.
To be more precise consider the following possible BNF equations:
.BEGIN TABIT1(13);TURN OFF "←";
<varlist>\::=[<obligatory> <optional> <excess>]
<obligatory>\::= <par>; ...<par> | %cε%*
<optional>\::= "optional" <opn>; ... <opn> | %cε%*
<excess>\::= "excess" <par> | %cε%*
<par>\::= <variable> | @<variable>
<opn>\::= <par> | <par> ← <form>
.END
The associated semantics follows:
.BEGIN INDENT 0,10;TURN OFF "←";
%21.%* The formal parameters are to be bound to the actual parameters from
left to right (as usual).
%22.%* There must be an actual parameter for each obligatory parameter, and
if there is no excess collector there may not be more actual parameters than
formals. (There may be less if we have optionals.)
%23.%* If a <variable> in a formal parameter is preceeded by a "@", then
the corresponding actual parameter is %2not%* evaluated.
%24.%* We might run out of actual parameters while binding the optionals.
If we do, then we look at the remaining formal optionals.
If a formal is simply a <par> then bind it to %3NIL%*; if a formal is
@<variable> ← <form> then bind the <variable> to the <form>; or if the formal is
<variable> ← <form>, bind <variable> to the value of <form>.
%25.%* Finally, the excess collector is bound to a list of any remaining
actual parameters in the obvious way: if <par> is <variable> then form a list
of the values of the remainder; if it is @<variable>, bind <variable> to the
actual list. If there is no excess, bind to %3NIL%*.
.END
.BEGIN TURN OFF "←";
These same languages have extended %3prog%*-variables slightly, allowing
them to be initialized explicitly. If a %3prog%*-variable is atomic,
intialize it to %3NIL%*, as usual. If it is of the form <variable>#←#<form>
then initialize it to the value of the <form>.
.END
.END
.BEGIN TURN OFF "←";turn on "\"; tabs 10,20;
Here are some examples:
1. In the initialization of %3length%* on {yon(P45)}, we could write:
%3...#prog[[l#←#x;#c#←#0]#....%*
2. %3list%* could now be defined as: %3λ[[%1"excess"%* x]x]%*.
3. Consider the following definition:
.BEGIN NOFILL;
.group
\%3baz <= λ[\[x;%1@%*y;%1"optional"%* z; u ← 0; %1"excess"%* v]
\\print[x];
\\print[y];
\\print[z];
\\print[u];
\\print[v]].
.APART
%1Then a call of:
%3eval[(BAZ 2 (CAR (QUOTE (X Y)) 4 5 6 7 (CAR (QUOTE (A . B))));NIL]%* would print:
%3
2
(CAR(QUOTE (X Y))
4
5
(6 7 A)%*
(and return value: %3(6 7 A)%*.
Similarly defining;
.GROUP
\%3fii <= λ[\[x;y;%1"optional"%* z; u ← 0; %1"excess"%3 v]
\\print[x];
\\print[y];
\\print[z];
\\print[u];
\\print[v]].
%1and calling:
%3eval[(FII 2 (CAR (QUOTE (X Y)));NIL]%* prints:
%3
2
X
NIL
0
NIL.%*
.END
.END
←%2Problems%*
Design simple sexpr representations of these proposed constructs.
Make these extensions to %3eval%*. How useful are these syntactic
sugarings?
.SS(A project: Syntax-directed processes,syntax-directed)
%1
.REQUIRE "NOTES.SDP" SOURCE_FILE;
.SEC(Implementation of the Static structure of LISP,static structure)
.TURN ON "←","α";
%1
.SS(Introduction)
The material in the previous chapters has been rather abstract and
perhaps the more practical-minded reader is becoming skeptical of
this whole endeavor.
This chapter begins a discussion of the actual mechanisms which
underlie a typical implementation of LISP. However the importance of the
techniques we will describe extends far beyond the implementation of
this particular language. Most of the ideas involved in our implementation
are now considered standard system programming techniques and
are common tools in language design. LISP is particularly well-suited
to the task of explicating these ideas since many find their origins in
the first LISP implementation.
We will begin our discussion of implementation with an analysis of
storage regimes for S-expressions. As with the more abstract discussions
of representations,
the "concrete" representation which we pick for our data structures (S-expressions)
will have direct bearing on the implementation of the primitive constructor
(%3cons%*), selectors (%3car, cdr%*) and predicates (%3atom, eq%*) of LISP.
Since we are now attempting to become less mathematical and more practical,
we must worry about the efficiency of the implementation
⊗↓But not to the detriment of generality or good program design.←, and we must
worry about input/output mechanisms so that the language may communicate with the
external world.
The present chapter will develop a picture
of this %2static%* structure of an implementation, or perhaps to be more
graphic, this chapter describes the memory of a LISP machine.
The next chapter discusses the %2dynamic structure%* of LISP;
that is, the control structures necessary to properly
evaluate expressions involving recursive functions and the other LISP
constructs.
Throughout these discussions we will
walk a narrow line, attempting to remain as abstract as
possible without losing too much detail, while at the same time, not
degenerating into a discussion of coding tricks. Thus we will attempt
to describe the "logical" structure of a LISP machine, even though
a more efficient implementation might map differing logical structures onto
the same physical structures by utilizing machine-dependent techniques.
As a gesture of good faith we will now resurrect our "box-notation"
for S-expressions ({yon(P102)}) and show how this notation mirrors
the logical (and usually physical) structure of LISP memory.
Subsequently we will show how to represent primitive LISP operations
in a concise graphical notation which manipulates these boxes.
Contemporary machines can simulate these manipulations with one or
more hardware instructions.
.SS(Representation of Symbolic expressions)
We previously have expressed the dotted pair %3(A#.(B#.#C))%* as:
.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";
/\
/ \
/ /\
/ / \
∞3A∞* ∞3B∞* ∞3C∞*
.END
or occasionally (on {yon(P102)}) as:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
⊂αααπααα⊃ ⊂αααπααα⊃
ααααα→~ # ~ #αβααααααααααα→~ # ~ # ~
%αβα∀ααα$ %αβα∀αβα$
↓ ↓ ↓
∞3A∞* ∞3B∞* ∞3C∞*
.END
This second style of graphical representation has a direct representation
in the storage layout of a machines.
Each "double-box" will be represented by a machine location and each
arrow will be represented as a %2⊗→pointer↔←%* to another machine
location.
Notice that each box contains two pointers; therefore each corresponding
machine location, %bloc%*,
will be interpreted as containing two machine addresses. The left-hand address
will represent the %3car%*-branch; the right-hand will represent the %3cdr%*-branch:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;group;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
⊂αααααπααααα⊃
loc ~ car ~ cdr ~
%ααααα∀ααααα$
.END
The pointers will either reference atoms or will
point to other two-pointer boxes.
Literal atoms
--#like %3A,#B,#C%*#-- will also be represented in machine locations; only here the
contents of each location will be an encoding of the name of the atom.
Obviously the
contents of such a location must not be interpreted as pointers.
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;group;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
⊂αααααααααααααααααααααα⊃
loc ~ rep. of literal atom ~
%αααααααααααααααααααααα$
.END
To help keep track of the different uses of machine locations we
will partition our machine's memory into two disjoint spaces: %2⊗→pointer space↔←%*,
the area which will contain pointers; and %2⊗→atom space↔←%* which will
contain information like atoms which should not be interpreted as
pointers.
Thus if the first box in our example were represented by location %b100%*
and the second were represented by location %b405%*, and the atoms
%3A%*, %3B%*, and %3C%* were represented by the numbers %b40%*, %b41%*, and
%b42%*, and were to be found
in locations %b710%*, %b762%*,
and %b711%*, respectively, then the following picture beginning at location
%b100%* could represent the dotted pair %3(A#.(B#.#C))%*.
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;group;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
⊂αααααπααααα⊃
100 ~ 710 ~ 405 ~
%ααααα∀ααααα$
...
⊂αααααπααααα⊃
405 ~ 762 ~ 711 ~
%ααααα∀ααααα$
...
⊂ααααααααααα⊃
710 ~ 40 ~
εαααααααααααλ
711 ~ 42 ~
%ααααααααααα$
...
⊂ααααααααααα⊃
762 ~ 41 ~
%ααααααααααα$
.END
Thus the left half of location %b100%* points to the representation of the
atom %3A%* and the right half points to the representation of the dotted pair
%3(B#.#C)%*.
Notice too that given the entry point into the representation --location %b100%*
in the example-- we can unambiguously discover the S-expr being represented.
This representation of Symbolic expressions is a special case of a
scheme called %2⊗→linked allocation↔←%*. The term "linked"
refers to the fact that to find succeeding elements in the representation
we must follow the explicit pointers.
The particular brand of linked allocation which we have demonstrated is called
%2⊗→single linking↔←%* ⊗↓also called "LISP-type allocation"←.
The term "single" means that only %2one%* pointer is stored as the representation of
the arrow, %b→%*.
In the case of LISP, the terminology means that the representation of each pointer
only tells us how to find succeeding elements in the structure. For
example if we were looking at location %b405%* the representation
tells us how to find the %3car%* or %3cdr%*; they're at %b762%* and
%b711%* respectively. But if we wanted to find the %2predecessor%*
of %b405%* in this representation it would require some further calculation.
We would have to start at the beginning of the S-expr representation
and look for a location such that its %3car%* or %3cdr%* is the desired cell.
If our use of S-exprs required frequent discovery of such predecessors
then we might consider an even more complex linked representation which
would also contain information about the predecessor of each node,
essentially representing %b→%* as %b↔%*. I.e.:
.P164:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "←"
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
⊂αααπααα⊃ ⊂αααπααα⊃
α←→α→~ # ~ #←βααα←→αααααα→~ # ~ # ~
%αβα∀ααα$ %αβα∀αβα$
↑ ↑ ↑
↓ ↓ ↓
∞3A∞* ∞3B∞* ∞3C∞*
.END
One such representation is called %2⊗→double-linking↔←%*. In this representation
of LISP data structures we could store %2three%* pieces of information with each
non-terminal node:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;group;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
⊂ααααααααααααα⊃
~ predecessor ~
loc εααααααπααααααλ
~ car ~ cdr ~
%αααααα∀αααααα$
.END
.BEGIN TURN OFF "←";
We will examine other storage techniques for complex
data structures in {yonss(P138)}.
.END
Data structures of less complexity than S-exprs have more compact
representation than linked allocation.
For example a typical representation of a vector,
or sequence of fixed length, is to store the elements sequentially
in memory⊗↓For more
details about representations of arrays and vectors see {yonss(P137)}.←.
Since each element in this structure has at most one successor
we can use the sequential addresses as implicit pointers to retrieve
successive elements. A general S-expr has two successors; thus
the linear addressing scheme of most machine memories is insufficient
as it stands.
.P140:
We will frequently wish to refer to several different S-exprs simultaneously;
for example, when we are talking about the implementation of the
function %3cons%* we will be manipulating the representations of two
S-exprs.
Similarly we will want to refer to several pieces of a single complex
S-expr; for example we might wish to "put a finger" at a specific point
in a structure and then, depending on the result of a computation
on some sub-part, move the "finger" either left or right.
To facilitate such discussions we will assume the existence
of a set of pointer registers: %bAC%41%1, %bAC%42%1, through %bAC%4n%1.
Thus, using the above example, the following represents %bAC%41%1 pointing
at %3(B#.#C)%* and %bAC%42%1 pointing at the atom %3A%*:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;group;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
AC∞41∞b AC∞42∞b
⊂αααααααα⊃ ⊂αααααααα⊃
~ 405 ~ ~ 710 ~
%αααααααα$ %αααααααα$
.END
These registers will be used as "fingers" pointing to representations
and we will soon give conventions as to their use in representing arguments
and values of computations.
.P159:
Implicit in our representation is the assurance that we can differentiate between
locations in atom space and locations in pointer space. For example assume each of
our locations can hold six digits and assume we will
store a numeric atom as its corresponding number. Then the atom
%3762711%* would be stored as:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;group;
.TURN OFF "%","#","α";
.TURN ON "∞" FOR "%";
⊂αααααααα⊃
~ 762711 ~
%αααααααα$
.END
Since this is exactly the contents of location %b405%*⊗↓The vertical
bar doesn't appear in the machine's memory.← some confusion
is possible: is the contents a number or is it two pointers?
A typical trick is to parition memory such that a particular
addressing space corresponds to each of the logical spaces:
atom space or pointer space. In our example we could assume that addresses
below %b700%* are locations for pointers, while addresses, %b700%* and
above contain atoms. Thus the representation of %3762711%* would
appear in a location with address %b700%* or greater.
Though our memory system is not completed yet, we %2do%* have enough
structure around to begin a discussion of the implementation of some of the
primitive LISP operations.
.CENT(Problems)
What problems do you foresee in using the double-linking scheme for
representing LISP's S-exprs?
.SS(Representation of LISP primitives,,P26:)
.SELECT 1;
Now that we have some of the representational problems for Symbolic expressions
reasonably well in-hand we will look after the implementation of the
of the LISP primitives. We will first examine %3car, cdr, eq%* and %3atom%*
in this section, leaving %3cons%* for later.
We first need to supply some conventions for passing arguments to these
primitives and we need to describe how values computed by functions are
to be returned. Recall our introduction of the pointer registers, %bAC%41%1
through %bAC%4n%1 on {yon(P140)}. We will use these registers now.
To represent the passing of values %3v%41%*,#...#v%4n%1 to an n-ary
function %3f%* we will set the registers %bAC%41%1,#...#%bAC%4n%1 to point
to the representations of the respective values. When the function %3f%*
has successfully completed its computation, it is expected to set %bAC%41%1
to point to the representation of the final value. Nothing is assumed
about the state of the other %bAC%4i%1 registers.
.BEGIN GROUP;
For example, here's %3car%*:
.BEGIN INDENT 10,10;
Look at the cell pointed to by %bAC%41%1. If it is an atom then %3car%* is undefined.
Hopefully it is %2not%* atomic; what we should then do is put the address which is
in the %2left%*-hand side of the box, in %bAC%41%1.
.END
.END
.GROUP;
Thus for example:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41 ∞* AC∞41 ∞*
⊂ααααααα⊃ ⊂ααααααα⊃
~ 102 ~ ~ 204~
%ααααααα$ %ααααααα$
∞1==∞3car∞1=>∞b
...
⊂αααπααα⊃
102 ~204~ 22~
%ααα∀ααα$
.END
.BEGIN CENTER;SELECT 2;
Example for %3car%*
.END
.APART
Thus for successful completion %b%3car%1 expects %bAC%41%1 to be pointing
to a node in pointer space; otherwise we get an error. If the match
is successful then %bAC%41%1 is changed to point to whatever was pointed to
by the left-hand side of the selected cell.
The description of %3cdr%* is sufficiently similar that we leave it to the
reader.
.GROUP;
On {yon(P159)} we described the internal structure of LISP atoms. Using that
representation we can give a simple implementation for the predicate %3atom%*:
.BEGIN INDENT 10,10;
Does %bAC%41%1 point into that area reserved for atom names? If so, give
a represenatation of truth as value, otherwise give a representation of
false.
.END
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41 ∞* AC∞41 ∞*
⊂ααααααα⊃ ⊂ααααααα⊃
~ 711 ~ ~ 714~
%ααααααα$ %ααααααα$
∞1==∞3atom∞1=>∞b
. . .
⊂ααααααα⊃
711 ~ "A"~ ⊗↓∞1We are writing ∞b"A"∞1 instead of the numeric encoding. Thus ∞b"A"∞1 is really ∞b42∞1.←∞b
%ααααααα$
. . .
⊂ααααααα⊃
714 ~ "T"~
%ααααααα$
.END
.BEGIN CENTER;SELECT 2;
Example for %3atom%*
.END
.APART
Notice that we did not need to examine the contents of location %b711%*;
it was sufficient to know that the location was between predetermined bounds.
If %bAC%41%1 was not pointing at an atom we would have returned a pointer to
a location containing %b"NIL"%1.
.GROUP
Finally we describe an implementation of %3eq%*.
.BEGIN INDENT 10,10;
Do
%bAC%41%1 and %bAC%42%1 both point into atom space?
If not the result is undefined. If they %2do%* then do they reference the same atom?
We can determine this latter condition in two ways: first, %bAC%41%1 and %bAC%42%1
might point to two different locations in atom space; we would have to examine the
contents of those locations; if they agreed then %3atom%* should return a
representation of truth. A more satisfactory solution is to store
each atom %2uniquely%*; one location will be reserved for %b"A"%1, etc. Now
all %3atom%* need do is make sure that %bAC%41%1 and %bAC%42%1 both point
into atom space %2and%* point to the same location⊗↓Thus no additional
memory reference is required.←.
.END
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41 ∞* AC∞41 ∞*
⊂ααααααα⊃ ⊂ααααααα⊃
~ 711 ~ ~ 714~
%ααααααα$ %ααααααα$
AC∞42 ∞*
⊂ααααααα⊃
~ 711 ~
%ααααααα$
∞1==∞3eq∞1=>∞b
. . .
⊂ααααααα⊃
711 ~ "A"~
%ααααααα$
. . .
⊂ααααααα⊃
714 ~ "T"~
%ααααααα$
.END
.BEGIN CENTER;SELECT 2;
Example for %3eq%*
.END
.APART
In most implementations
the check for atom-ness is suppressed and a simple address comparison
is made. Non-atomic S-expressions are not usually stored uniquely⊗↓See
the problem on hash consing on {yon(P154)}.←; Thus
in most implementations
←%3eq[(A . B);(A . B)]%* is usually false, but
←%3eq[x;x] %*is usually true for any %3x%*.
Please note that we are %2not%* changing the definition of %3eq%*.
%3eq%* is still undefined for non-atomic arguments.
The preceding remarks deal only with the usual implementation of
%3eq%*⊗↓Remember that coding tricks should be used like garlic
in cooking.←.
We still aren't out of the woods yet: if we wish to represent %340%*
as %b40%* then we still have a conflict with our proposed
representation of the atom %3A%* as %b40%*.
The next section
examines the question of what are reasonable representations for
literal atoms⊗↓or non-numeric atoms←.
.CENT(AMBIT/G)
.P158:
Before looking more deeply into the subconscious nature of atoms,
we should explore a possibility for describing LISP's primitives.
We have described them by example, without much loss in clarity.
It would be more
pleasing if we could describe each primitive in more general terms.
Fortunately we can give less machine-dependent characterizations
using a micro version of a graphical language called ⊗→AMBIT/G↔←⊗↓An
ambit is half aminal, half hobbit. AMBIT/G also is an acronym
for Algebraic Manipulation By Identity Transformation/Graphical.←.
When faced with explaining a complex structure-manipulating
program to someone we draw pictures. We do it in LISP to describe the data
structures and in this section we have given graphical descriptions
of the LISP primitives using examples. AMBIT is an extension of
both of these ideas; it
is a graphical language for the description of both
data and algorithms.
The basic statements of the language are %2pattern-match-and-replacement%*
rules.
Patterns are described as combinations of shapes and solid arrows.
If an instance of a pattern can be
found in the current state of the computation, then we will replace
that instance with a new pattern.
The only kind of replacement we will allow is the %2swinging%*
of an arrow so that its head moves from one node to another.
Thus the new pattern differs from the old only in the positioning
of some of the arrows.
Where the arrow head strikes a node is immaterial; the origin of the tail
%2is%* important.
Arrows which are to modified in the pattern matching are dotted.
.GROUP;
For example here's %3car%*⊗↓Portions
of the shapes marked with "?" are "don't-%3car%*e" conditions.←:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41 ∞*
⊂ααααααα⊃
~ # .~. → .
%ααααβαα$ .
~ .
↓ .
⊂αααπααα⊃ .
~ # ~ ? ~ ↓
%αβα∀ααα$ .
↓ .
⊂ααααααα⊃ .
~ ? ~← . .
%ααααααα$
.END
.BEGIN CENTER;SELECT 2;
algorithm for %3car%*
.END
.APART
This AMBIT diagram contains equivalent information to the previous
%2example%* of %3car%*, but the extraneous details of the which
addresses are involved have been stripped away.
We will use such diagrams occasionally when they will contribute to clarity.
.SS(Symbol tables: revisited,,P128:)
There are some rather gross defects in our symbol table mechanism.
First, we have had to resort to a certain subterfuge
to put the definitions of our functions in our symbol table.
Using %3label%* ({yonss(P38)}), or calling %3eval%* with an initial
symbol table, only defines the function for %2that%* particular call
on %3eval%*. When we finish the computation, the definition disappears.
From a practical point of view definitions should have some permanence.
An implemented version of LISP should know a
lot more functions than the five primitives. It should have a
reasonable class of arithmetic functions, and some commonly used
S-expr functions (like %3length, assoc, subst, equal,%* etc.) all predefined.
If we do things right we should be able to use the mechanism for
predefining functions as the tool for adding our own definitions.
Second, the table look-up mechanism of %3assoc%*, though adequate,
leaves something to be desired as far as efficiency is
concerned. Since we do wish to implement LISP, we should be concerned with
efficient operations. We will see that an efficient and powerful symbol table structure
can be described quite easily in LISP.
Third, we have already seen that special forms like %3QUOTE%* and %3COND%* are
not evaluated in call-by-value style.
Currently the recognizers for special forms are encoded in %3eval%*, and
when an instance of such a form is seen the argument is passed %2unevaluated%*.
It would be nice to extend this flexibility
to the user. We would like to have a notation for adding λ-definitions of
special forms to our symbol table.
%3eval%* would then need a way of distinguishing a
λ-expression defining a function from a λ-expression defining a special form.
Also there are other calling sequences which are interesting and useful and
would be nice to have.
The current version of %3eval%* only handles call-by-value definitions. Thus
the current symbol table need only store the λ-definition and does not
need explicit information saying it is such a function definition.
Fourth, conceptually we could use a more powerful mechanism. Consider
%3car[car]%*. Perhaps it is best just
to say the expression is ill-formed, but if we knew that %3car%* also
was currently being used as a simple variable besides being a function,
then our intuition says the first occurrence of %3car%* is a function
name and the second occurrence is the simple variable name; and if the
current value of %3car%* was say, %3(A.B)%* then %3car[car]%*
means apply the %2function%* %3car%* to the value of the %2variable%* %3car%*
and should therefore evaluate to %3A%*.
That is, context tells us which occurrence of %3car%* is a function
and which is a simple variable⊗↓just as an evaluator for %3prog%* can
tell a reference to the label %3x%* from the %3prog%*-variable %3x%* in
%3#...#prog[[x;y]#..#x#f[..]#...#g[x#..#]#...#go[x].%* See {yon(P83)}.←.
The current %3eval%* %2will%* operate correctly on this example since
a recognizer for the function %3car%* is explicitly encoded in %3apply%*⊗↓For
the same reason, the LISP obscenity %3λ[[lambda] ... ]%* will work.
Notice its S-expr representation is %3(LAMBDA#(LAMBDA)# ...#)%*←.
However, in general our current symbol table mechanism is not sufficient for
this interpretation. Forced to use %3assoc%*,
we would only find the %2first%* binding of a variable. It will either
be a λ-expression (function) or a simple value.
Clearly what is needed is the ability to
associate more than one property with an atom. In the above example
%3car%* has (at least) two %2properties%* or %2attributes%*. It has a function
value and it has a simple value. Since %3car%* is a primitive function,
its function value is the location of the machine code to carry out
the %3car%* function; the simple value is the constant currently bound to
the variable, %3car%*.
We could continue using an
%3assoc%*-like table, storing information
about what type of value is being represented. For example:
.BEGIN CENTER;SELECT 3;
((CAR . (VALUE (A . B))), (FACT .(LAMBDA_EXPR (LAMBDA (X) (COND ...))) )
.END
.TURN OFF "α";
Searching down such a table is expensive and our desire for
efficiency would certainly not be satisfied. Rather, we will
design a new symbol table which can store with each variable
sequences of values and
their types. This gives us multiple properties.
We will make the table %2global%*
to %3eval%* and %3apply%*; these functions will now implicitly refer to that table
and thus need not have an explicit symbol-table argument⊗↓This will cause
pains in handling of procedure-valued variables.←. This
gives us permanence.
We will designate an indicator for function definition and an indicator
for special form definition and expand %3eval%* and %3apply%* to recognize
and deal with these indicators. This will give us generality.
.P139:
The first hint on how to accomplish our aim comes from the realization that
the values which we are talking about are properties associated with
variables, and that our representation of variables will be atoms.
Thus we want to associate the properties with atoms⊗↓atoms are
frequently called %2carriers%*←; then any time we need to
know a value for a particular variable, we go the the corresponding atom and
look for the property. This scheme would obviate the current need to search
the association list of atoms and values; that search will typically be long
since the table can get large and our means of searching is very
inefficient⊗↓linear search←.
If we can find an efficient way of storing and
locating atoms, then we're in business.
Most implementations of LISP allow the association of
arbitrary sequences of %6attribute-value%* pairs with an atom.
With each atom we will
associate a list called the %6⊗→property-list↔←%* or p%6-list.%* But atoms
can't be represented as just any arbitrary list.
We must be able to recognize the occurrence of an atom so that we can
implement the predicate %3atom%*.
We are %2not%* going to continue atom space as a separate entity. For one reason
we will be doing a complete revision on the storage properties of atoms; it also
makes it a bit more difficult to draw meaningful pictures of representations of
atoms.
So atoms will be represented as special lists: the
%3car%* (or first element) of the representation of each atom is an
indicator used exclusively for the beginning of atoms.
We will use %b∃%* to designate the beginning of an atom.
The %3cdr%* of the representation of the atom is the property list.
Such locations in pointer space containing %b∃%* in their left-half and
a pointer to a p-list in their right-half are called %2⊗→atom header↔←s%*.
The elements of the p-list are associated in pairs. The first element
of a pair is
the attribute (or property or indicator); the next element is the value.
.GROUP
For example, here
is part of the atom-structure for %3car%* assuming %3car%* is also being
used as a simple variable and has current value, %3(A B)%*⊗↓In the diagram
%bAC%41%1 is used to reference the atom %3CAR%*; it is not part of the
atom.←:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
.TURN ON "?" FOR "\";
.TABS 52,57;
AC∞41∞*
⊂ααααα⊃
~ # ~
%ααβαα$
↓
⊂ααπαααα⊃ ⊂ααααααπαααα⊃ ⊂αααπααα⊃ ⊂αααααααπααα⊃ ⊂αααααπααα⊃
~∃ ~ #αβαα→~ SUBR ~ #αβ→~ # ~ #αβα→~ VALUE ~ #αβ→~ # ~ ...
%αα∀αααα$ %αααααα∀αααα$ %αβα∀ααα$ %ααααααα∀ααα$ %ααβαα∀ααα$
~ ~
to machine code ←ααααα$??↓
for ∞3car ∞* ?⊂αααααπααααα⊃ ⊂αααααπαα⊃
?~ A ~ #ααβα→~ B ~≤'~
?%ααααα∀ααααα$ %ααααα∀αα$
.END
.P50:
.BEGIN CENTER;
%2Part of the atom-structure for %3CAR%1
.END
.apart
The indicator, ⊗→%3VALUE%*↔←, tells us that %3car%* is
being used as a simple variable and has value %3(A,#B)%*.
The indicator, ⊗→%3SUBR%*↔←, tells us that %3car%* is a machine
language function. When we wish to use %3car%* as a function
the machine-dependent code will be executed.
.P29:
How about the representation of non-primitive functions?
We know that non-primitive functions are defined using λ-expressions;
we introduce another
indicator, ⊗→%3EXPR%*↔←, which will be used as the
attribute to which we attach a function defined as a λ-expression. For example,
we might define
.BEGIN CENTERIT;GROUP;
←%3fact <= λ[[x][x=0 → 1; %et%* → times[x;fact[sub1[x]]]]]
%1 and the S-expr translation of the right-hand side would be:
←%3(LAMBDA(X)(COND ((ZEROP X) 1) (T (TIMES X (FACT (SUB1 X))))))
.END
To represent the intention that %3fact%* is to be defined as
the above recursive function, we will store the S-expr representation
on the property-list of the atom %3FACT%* and use %3EXPR%*
as its indicator.
.GROUP
With this in mind,
here is part of the atom-structure for %3FACT%*:
.<<atom struct for fact>>
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41∞*
⊂ααααα⊃
~ # ~
%ααβαα$
↓
⊂ααααααπαααα⊃ ⊂αααπαααα⊃
~ EXPR ~ #αβαα→~ # ~ #αβ→ ....
%αααααα∀αααα$ %αβα∀αααα$
↓
⊂ααααααααπααα⊃ ⊂αααπααα⊃ ⊂αααπαα⊃
~ LAMBDA ~ #αβαα→~ # ~ #αβα→~ # ~≤'~
%αααααααα∀ααα$ %αβα∀ααα$ %αβα∀αα$
~ ↓
⊂αααααααααα$ ⊂ααααααπααα⊃ ⊂αααπααα⊃ ⊂αααπαα⊃
↓ ~ COND ~ #αβ→~ # ~ #αβ→~ # ~≤'~
⊂αααπαα⊃ %αααααα∀ααα$ %αβα∀ααα$ %αβα∀αα$
~ X ~≤'~ ↓ ↓
%ααα∀αα$ ... ...
.END
.BEGIN CENTER;
%2Atom-structure for %3FACT%1
.END
.APART
Keep in mind that we are storing the data structure
representation of the %3fact%* function thus
.BEGIN CENTERIT;
←%3(LAMBDA(X)(COND ((ZEROP X) 1) (T (TIMES X (FACT (SUB1 X))))))
.END
is a perfectly good list and therefore a data structure. If we attached it
to the indicator %3VALUE%* then we would have represented the list as the
%2simple%* value of %3fact%*. This obviously has a completly different meaning.
Primitive special forms like %3COND%* and %3QUOTE%* give rise to another
indicator. %3FSUBR%* will appear as an indicator on the
property lists of these atoms; when an instance of such a special form is
recognized, the argument list is passed to the primitive without
any evaluation.
In later sections of this chapter we will introduce other indicators
used in the implementation; for example, in {yonss(P18)} we will
present calling protocols other than
that expressed by %3EXPR%*'s.
The ability to define properties and manipulate property-lists
is a useful technique and is available to the LISP user.
.P142:
Both the LISP implementation and the user access the property-list with the same
functions. There are four such:
.BEGIN INDENT 0,15;GROUP
.p52:
%3putprop[x;v;i]:%*
⊗→%3putprop%*↔← will put the value %3v%* under the indicator %3i%* on the
property-list
of the atom %3x%*. If the indicator %3i%* already appears on the p-list then the
%3v%* over-writes the old value; otherwise a new attribute-value pair is added
to the front of the p-list of %3x%*. The value returned by %3putprop%* is %3v%*.
For example, %3putprop[CAR;(A#B);VALUE]%* has the effect of assigning the
simple value %3(A#B)%* to %3car%*; %3putprop[FACT;(LAMBDA(X)#...);EXPR]%*
will have the effect of defining %3fact%* to be a call-by-value function.
.END
.BEGIN INDENT 0,15;GROUP
%3get[x;i]: %*
⊗→%3get%*↔← will search the property-list of the atom %3x%* looking for the
indicator %3i%*. If %3i%* is found the value associated with that indicator is
returned by %3get%*.
If %3x%* does not have the
indicator then %3NIL%* is returned.
Thus %3get[FACT;VALUE]%* gives %3NIL%*, but %3get[FACT;EXPR]%* retrieves
the function definition.
%3get[CAR;VALUE]%* for %3CAR%* on {yon(P50)} will return %3(A#B)%*.
.END
.BEGIN INDENT 0,15;GROUP
%3getl[x;l]:%*
%3l%* is a list of indicators.
⊗→%3getl%*↔←
will search the property-list of the atom %3x%* for the %2first%* occurrence
of any indicator which appears in %3l%*. If such a match
is found, then the %2remainder%* of the p-list, beginning with the
matching indicator, is returned. If no match is found, %3NIL%* is
returned.
The virtue of %3getl%* is that it can distinguish between an atom which has
an indicator with value %3NIL%* and an atom which does not have the indicator.
%3get%* cannot communicate this distinction.
.END
.GROUP
An example:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
∞3getl[FOO;(BAZ)]∞*
⊗
∞3getl[FOO;(PNAME BAZ)]∞* ∞3get[FOO;PNAME]∞*
↓ ↓
⊂ααα←ααααααα$ ~
⊂απααα⊃ ↓ ⊂αααααπααα⊃ ⊂αααααπααα⊃ ⊂αααααααπααα⊃ ⊂αααπαα⊃ ~
~∃~ #αβαα→~ BAZ ~ #αβα→~ NIL ~ #αβα→~ PNAME ~ #αβα→~ # ~≤'~ ↓
%α∀ααα$ %ααααα∀ααα$ %ααααα∀ααα$ %ααααααα∀ααα$ %αβα∀αα$ ~
ε←ααα←ααα$
↓
⊂αααπαα⊃
~ # ~≤'~
%αβα∀αα$
↓
⊂ααααα⊃
~FOO≡≡~
%ααααα$
.END
and %3get[FOO;BAZ] = get[FOO;BAR] = NIL%*.
.APART
.BEGIN INDENT 0,15;
%3remprop[n;i]%1: The final function in this class is used to remove
attribute-value pairs
from the p-list of an atom. The function is named ⊗→%3remprop%*↔←.
%3remprop%*
has two arguments, %3n%*, an atom, and %3i%*, an indicator. If the indicator is found on the
p-list of the atom, then %3remprop%* removes the indicator-value pair and returns
%3T%*; otherwise it returns %3NIL%*.
Thus %3remprop[FACT;EXPR]%* effectively removes the function definition
from the atom %3FACT%*.
.END
The exact definition of these functions will have to wait until {yonss(P155)}.
So the simple atom is becoming much more complex. It has a whole substructure
attached to it. Thus each atom is like a word in a dictionary; many words can
be used as different parts of speech and their dictionary entries will
reflect this by having several alternative meanings. So too with atoms
in LISP; an atom will typically have several different "meanings" attached to it
and depending on the context, we will be interested in one of those interpretations.
Just as we will find all meanings of a specific word in one location in the
dictionary, our implementation of LISP
becomes much simpler if we store each atom and its associated p-list uniquely.
For example,
every reference to the atom %3A%* is actually a pointer to the same
location in memory. This location has a %3car%*-part which is the special
atom indicator %b∃%*,
and a %3cdr%*-part which is the ⊗→p-list↔← for the atom %3A%*.
.GROUP
Thus %3(A . A)%*, which we have been representing as:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂αααααπααααα⊃
~ A ~ A ~
%ααααα∀ααααα$
.END
is more realistically represented as:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂αααααπααααα⊃
~ # ~ # ~
%ααβαα∀ααβαα$
~ ~
~ ~ ⊂απααα⊃
%ααααα∀α→~∃~ #αβα→ ∞1p-list for ∞3A∞b.
%α∀ααα$
.END
.APART
.P48:
So the internal structures of this implementation of LISP are %2not%* L-trees;
there are intersecting branches. Structures of this more general sort
are called %2⊗→list structure↔←%*. L-trees are special instances of list
structure.
LISP deals with binary list structure since each non-terminal node in our
representation has exactly two branches.
We will see in a moment that many of the
computations which we have been performing have also been generating
list structure.
Question: Assume we have the above dotted pair as a value of a variable %3x%*
and we wish to print the value of %3x%*.
Clearly we would hope to see "%3(A . A)%*" appear on the output device.
We would expect that the print routine, named %3print%*, would be given
a pointer to the dotted pair. %3print%* can recognize that it %2is%* a dotted
pair since its %3car%* is not %B∃%*.
But how can %3print%* distinguish %3(A . A)%* from %3(B . B)%* for example.
The pointer in the preceding diagram will point to %3A%* in one case
and to %3B%* in the other, but nothing on the p-list of the atom tells us %2what%*
to print. The simplest thing to do is to store a representation of the name
on each p-list.
This is done with another indicator called ⊗→%3PNAME%*↔←, standing for
%2⊗→print-name↔←%*. Each atom is guaranteed to have a print-name or "p-name".
Thus the atom %3BAZ%* will have at least the following:
.GROUP
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41∞*
⊂ααααα⊃
~ # ~
%ααβαα$
↓
⊂αααααααπαααα⊃ ⊂αααπαααα⊃
~ PNAME ~ #αβαα→~ # ~ #αβ→ ....
%ααααααα∀αααα$ %αβα∀αααα$
↓
⊂αααπαα⊃
~ # ~≤'~
%αβα∀αα$
↓
⊂ααααααα⊃
~ BAZ≡≡ ~
%ααααααα$
.END
←%2Atom-structure for %3BAZ%1.
.APART
The indicator is called %3PNAME%* because the print-name (or ⊗→p-name↔←)
of the atom is what the LISP output routine will print as the name of
the atom. Indeed, the %2only%* time we will need the atom's name is when we wish
to print an S-expr containing that atom.
%bBAZ≡≡%1 means a memory location
containing some encoding of the letters %bB%*, %bA%*, and %bZ%*.
The symbol, %b≡%*, represents some non-printing character; thus we
are assuming a location can contain five characters.
We represent the print-name as a list so that we may allow atoms with
p-names of unbounded length. Thus the p-name for %3BLETCH%*,
would be:
.GROUP
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41∞*
⊂ααααα⊃
~ # ~
%ααβαα$
↓
⊂αααααααπαααα⊃ ⊂αααπαααα⊃
~ PNAME ~ #αβαα→~ # ~ #αβ→ ....
%ααααααα∀αααα$ %αβα∀αααα$
↓
⊂αααπααα⊃ ⊂αααπαα⊃
~ # ~ #αβαααα→~ # ~≤'~
%αβα∀ααα$ %αβα∀αα$
↓ ↓
⊂ααααααα⊃ ⊂ααααααα⊃
~ BLETC ~ ~ H≡≡≡≡ ~
%ααααααα$ %ααααααα$
.END
←%2P-name structure for %3BLETCH%1.
.APART
Notice that the actual print-names %bBAZ≡≡%*, %bBLETC%*, and %bH≡≡≡≡%*, are
candidates for storage in atom space since these encodings should not
be interpreted as pointers.
With such print-names on each property-list
%3print%* can now function: it will search the p-list for the indicator %3PNAME%*
and print what it finds. For the details of LISP output see {yonss(P13)}.
How do we get atoms stored uniquely?
Since all of our LISP programs must be read into the memory we let the
input function named ⊗→%3read%*↔← keep track of these details.
On reading
an atom name (which is the way almost all atoms come into existence),
the %3read%* program checks the current symbol table. If the atom does
not appear we add a new symbol table entry consisting of the p-list with a single
attribute-value pair ----#%3PNAME%*,#p-name#---. If the atom %2does%*
appear, we bring back a pointer to the appropriate entry.
In summary, we have discussed the details of a typical implementation of the
class of Symbolic Expressions. Our non-atomic S-exprs have their branching structure
stored in what we have called pointer space. Our initial discussion
of atoms supposed a particular simple representation: simply store the
encoding of the atom in memory location found in a separate space which we called
atom space. Upon further reflection we decided that atoms should play
a more active role in the implementation. Since variables are to be represented as
atoms we needed some way to represent those properties typically associated with
variables. Variables in LISP are, among other things, used for names of
functions and names for simple S-expr values. Thus we needed the ability
to represent at least these two kinds of "values" with a LISP atom.
We introduced the general scheme of property-lists and associated
such a p-list which each LISP atom. All the things we know about a specific
atom are stored on the p-list. Thus we wanted each atom stored uniquely;
then anyone who wanted to examine the properties of an atom would
only have to look at the p-list. The burden of keeping unique storage was left
up to the input program. The effect of storing atoms uniquely was to turn
our abstract LISP-trees into list structure. That is, there are intersections
in the structures. Indeed, the representation of a LISP expression is a
complex net of intersecting pointers; even atoms are now pointers. The only
LISP objects we now have which are %2not%* pointers are the actual print-names
like %bBAZ≡≡%*. Since atoms aren't really "atomic" but have most of their
representation in pointer space, it is more realistic to
introduce a new name for that area. We will call that area
of memory which contains information %2not%* to be interpreted as
pointers, %2⊗→Full Word Space↔←%*.
To reinforce our discussion we illustrate on the next page the
representation of the atom %3NIL%*. In all of the resulting worms
there are only three elements in Full Word Space; everything else is a
pointer.
.<<pic of NIL>>
.SKIP TO COLUMN 1;
.SS(A picture of the atom %3NIL%*,%3NIL%*)
.P141:
We have been writing the atom %3NIL%* as:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂απααα⊃ ⊂αααααααπααα⊃ ⊂αααπααα⊃ ⊂αααααααπααα⊃ ⊂αααπαα⊃
~∃~ #αβα→~ VALUE ~ #αβα→~ # ~ #αβα→~ PNAME ~ #αβα→~ # ~≤'~
%α∀ααα$ %ααααααα∀ααα$ %αβα∀ααα$ %ααααααα∀ααα$ %αβα∀αα$
↓ ~ ⊂αααπαα⊃
NIL %α→~ # ~≤'~
%αβα∀αα$
↓
⊂ααααα⊃
~NIL≡≡~
%ααααα$
.END
Where the atoms for %3PNAME%* and %3VALUE%* are represented as:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂απααα⊃ ⊂αααααααπααα⊃ ⊂αααπαα⊃ ⊂απααα⊃ ⊂αααααααπααα⊃ ⊂αααπαα⊃
~∃~ #αβα→~ PNAME ~ #αβα→~ # ~≤'~ ~∃~ #αβα→~ PNAME ~ #αβα→~ # ~≤'~
%α∀ααα$ %ααααααα∀ααα$ %αβα∀αα$ %α∀ααα$ %ααααααα∀ααα$ %αβα∀αα$
↓ ↓
⊂αααπαα⊃ ⊂αααπαα⊃
~ # ~≤'~ ~ # ~≤'~
%αβα∀αα$ %αβα∀αα$
↓ ↓
⊂ααααα⊃ ⊂ααααα⊃
~PNAME~ ~VALUE~
%ααααα$ %ααααα$
.END
More realistically we should represent %3NIL%* as:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂ααααααααααααααααααα←ααααααααααααααααααααααααααα←ααααααπααααπααααααααπααα⊃
↓ ↑ ~ ↑ ↑
⊂απααα⊃ ⊂αααααααπααα⊃ ⊂αααπααα⊃ ⊂αααααααπααα⊃ ⊂αααπαβα⊃ ↑ ~ ~
~∃~ #αβα→~ # ~ #αβα→~ # ~ #αβα→~ # ~ #αβα→~ # ~ # ~ ~ ~ ~
%α∀ααα$ %αααβααα∀ααα$ %αβα∀ααα$ %αααβααα∀ααα$ %αβα∀ααα$ ~ ~ ~
↑ %αααααααααα→ ~ →αααααααα→ ~→αα⊃ ~ ↑ ~ ~
~ ↓ ~ ~ ~ ⊂αααπαβα⊃ ~ ~
~ ~ ↓ ↓ %α→~ # ~ # ~ ~ ~
~ ↓ ~ ~ %αβα∀ααα$ ~ ~
~ ~ ~ ~ ↓ ↑ ↑
%ααααααα←ααααααααα←αααααα∀αααπαααα⊃ ~ ~ ⊂ααααα⊃ ~ ~
↑ ↑ ~ ~ ~NIL≡≡~ ~ ~
~ ~ ↓ ~ %ααααα$ ~ ~
⊂αααααααααα←ααααααααααααααα← ~ ←α ~ ←α$ ~ ~ ~
↓ ~ ~ ↓ ~ ~
⊂απααα⊃ ⊂αααααααπααα⊃ ⊂αααπαβα⊃ ~ ⊂απααα⊃ ⊂αααααααπααα⊃ ⊂αααπαβα⊃ ↑
~∃~ #αβα→~ # ~ #αβα→~ # ~ # ~ ↑ ~∃~ #αβα→~ # ~ #αβα→~ # ~ # ~ ~
%α∀ααα$ %αααβααα∀ααα$ %αβα∀ααα$ ~ %α∀ααα$ %αααβααα∀ααα$ %αβα∀ααα$ ~
↑ ~ ↓ ~ ~ ↓ ~
εααααα←ααααα$ ⊂αααπααα⊃ ↑ ↓ ⊂αααπααα⊃↑
~ ~ # ~ #αβα$ ~ ~ # ~ #αβ$
~ %αβα∀ααα$ ~ %αβα∀ααα$
~ ↓ ~ ↓
↑ ⊂ααααα⊃ ~ ⊂ααααα⊃
~ ~PNAME~ ↓ ~VALUE~
~ %ααααα$ ~ %ααααα$
%αααααααα←ααααααααααααα←ααααααααααααα←αααααααααααααα$
.END
.SS(A first look at %3cons%1)
%1
We have discussed the implementation possibilities for all of the
LISP primitive operations %2except%* %3cons%*. We will remedy this
oversight now.
The effect of the ⊗→%3cons%*↔← function is quite different from that
of the other LISP primitives. The other functions (or
predicates) manipulate existing S-expressions, whereas %3cons%*
must construct a new S-expression from two existing S-exprs.
That is, given representations of two S-exprs, say %3x%* and %3y, cons[x;y]%*
is to get a new cell, put the representation of %3x%* in the %3car%*-part of
the cell and the representation of %3y%* in the %3cdr%*-part:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
∞1result of ∞3cons[x;y]∞B
~
~ ⊂αααααπααααα⊃
%ααααααααα→~ # ~ # ~
%ααβαα∀ααβαα$
~ ~
⊂αααα$ %αααα⊃
~ ~
↓ ↓
∞1rep. of ∞3x ∞1rep. of ∞3y∞1
.APART
.END
Where is %3cons%* to obtain these new cells? This
materialization is accomplished by the ⊗→free space list↔←.
When a computation
is begun, only the atom structure for the initial LISP symbol table
uses cells in the pointer-area. The remaining pointer-cells
are linked together and form the %2⊗→free-space list↔←%* or FS list.
Whenever a %3cons%* needs a cell the first cell in the FS list is used and
the FS list is set to the %3rest%* of the FS list. For example the following
represents the effect of %3cons[A;B]%*:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41∞* AC∞42∞* FS list
⊂ααααα⊃ ⊂ααααα⊃ ⊂ααααα⊃
~ # ~ ~ # ~ ~ # ~
%ααβαα$ %ααβαα$ %ααβαα$
↓ ↓ ↓
⊂απααα⊃ ⊂απααα⊃ ⊂ααπααα⊃ ⊂ααπααα⊃ ⊂ααπαα⊃
~∃~ #αβ→... ~∃~ #αβ→ ... ~≤'~ #αβ→~≤'~ #αβ→ ...→~≤'~≤'~
%α∀ααα$ %α∀ααα$ %αα∀ααα$ %αα∀ααα$ %αα∀αα$
atom A atom B
.BEGIN CENTER
∞2Before∞1
.END
.END
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41∞* AC∞42∞* FS list
⊂ααααα⊃ ⊂ααααα⊃ ⊂ααααα⊃
~ # ~ ~ # ~ ~ #ααβααα⊃
%ααβαα$ %ααααα$ %ααααα$ ~
%ααααααααα→ααααααααααααα→ααααααα⊃ ~
↓ ↓
⊂αααπααα⊃ ⊂ααπααα⊃ ⊂ααπαα⊃
~ # ~ # ~ ~≤'~ #αβ→ ...→~≤'~≤'~
%αβα∀ααα$ %αα∀ααα$ %αα∀αα$
↓ ↓
⊂ααααααα←αααααααααααα←αααααααα$ ~
~ ⊂αααααααα←αααααααααα$
↓ ↓
⊂απααα⊃ ⊂απααα⊃
~∃~ #αβ→... ~∃~ #αβ→ ...
%α∀ααα$ %α∀ααα$
atom A atom B
.BEGIN CENTER
∞2After∞1
.END
.END
As the computation continues, cells are taken from the FS list.
When a %3cons%* operation asks for a cell and the FS list is empty, the
computation is suspended and the %2⊗→storage reclaimer↔←%* or %2⊗→garbage collector↔←%* is
called. The job of the garbage collector is to locate cells for a new
free space list.
.SS(Storage management: garbage collection,garbage collection,P146:)
During the course of a computation, contents of cells which were taken
from the FS list often become unnecessary. For example during the evaluation
of something as simple as:
←%3(CONS (QUOTE A)(QUOTE B)),%* many cells are used:
.GROUP;
%21.%* At least seven cells are needed just to read in the expression:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂ααααααπααα⊃ ⊂αααπααα⊃ ⊂αααπαα⊃
~ CONS ~ #αβα→~ # ~ #αβα→~ # ~≤'~
%αααααα∀ααα$ %αβα∀ααα$ %αβα∀αα$
↓ ↓ ⊂αααααααπααα⊃ ⊂αααπαα⊃
~ %→~ QUOTE ~ #αβα→~ B ~≤'~
~ %ααααααα∀ααα$ %ααα∀αα$
~ ⊂αααααααπααα⊃ ⊂αααπαα⊃
%α→~ QUOTE ~ #αβα→~ A ~≤'~
%ααααααα∀ααα$ %ααα∀αα$
.END
.APART
If some of the atoms are not present in the symbol table,
more cells will be needed.
%22.%* One cell will be needed to perform the %3cons%* operation. See
the previous example.
After the computation is completed, none of the eight mentioned cells are needed.
They are garbage.
Why not simply return these `garbage cells' explicitly to the FS list?
Frequently it is very difficult to know just exactly which cells
%6are%* garbage. Indeed, experiments have been performed in which LISP
programmers were allowed to return `garbage' to the FS list themselves.
The results were disastrous; list structure, thought to be garbage
was returned to the FS list by the programmers, unaware it was still being
used by other computations. We will see where these difficulties arise
later.
Thus reclamation is passed to the LISP system. The %3cons%* function and
its FW space counterpart, %3fwcons%*, are the only functions allowed
to mainpulate the free storage lists. When either list becomes empty, the
garbage collector is called.
%2←The fundamental assumption behind garbage collection is:%*
.NARROW 10,10;
At any point in a LISP computation, all cells which contain
parts of the computation are reachable (by %3car-cdr%* chains)
through a fixed set of known cells or registers.
.WIDEN;
The first phase of the garbage collector, called the %2⊗→marking phase↔←%*,
marks all of the list structure which is currently active (reachable
through the above mentioned fixed cells). This should include all the
atoms in the symbol table and all the cells reachable by %3car-cdr%* chains
from any of these atoms. Also any partial computations which have been
generated must also be marked. Once all the active structure has been
marked, we proceed to the second phase, the %2⊗→sweep phase↔←.%*
Once the marking has been completed, we go linearly (sweep) through memory,
collecting all those cells which have not been marked. Again,
the assumption is that if cells are not marked in the first phase,
then they do not contain information necessary to the LISP computation.
These unmarked cells are chained together via their %3cdr%*-parts to form
a new FS list. The FS pointer is set to the beginning of this list;
similarly, the unmarked cells in Full Word Space comprise the new FW list.
.CENT(A simple LISP garbage collector written in LISP)
We will now write a garbage collector in LISP to mark and sweep nodes
in FS and FWS.
More complex algorithms will be discussed in {yonss(P27)} and on {yon(P144)}.
The present algorithm will have three main functions:
.BEGIN INDENT 0,10;
%3initialize[x;y]%* to initialize the marking device for each cell in the space from
%3x%* to %3y%*.
%3mark[l]%* to do the actual marking of a list %3l%*.
%3sweep[x;y]%* to collect all inaccessible cells in the space delimited
by %3x%* and %3y%*.
.END
%3initialize%* will be called twice; once for FS and once for FWS.
%3mark%* will be called for each base register which points to
active (binary) list structure.
The basic structure of the marker is: if the word is in FWS mark it and leave;
if the word has already been marked simply leave
since we are assured that any cells further down the structure
have already been marked. Otherwise the word is
in FS and thus has a %3car%* and a %3cdr%*; mark the word; mark the %3car%*;
mark the %3cdr%*. The marker is thus recursive.
%3sweep%* will be called twice; once to generate a new free space list and once
to generate a new full word space list. Elements of these free
lists will be chained together by their %3cdr%* parts.
The initialization and sweep phases of this garbage collector are very similar;
both phases must be assured of reaching every node in the space.
These main functions use several other functions and predicates:
.BEGIN INDENT 0,10;
%3fwswrdp[x]%* is true just in the case that %3x%* is a word in FWS. This is
used as one of the termination conditions of %3mark%*.
%3makeA[x]%* marks word %3x%* as accessible.
%3makeNA[x]%* marks word %3x%* as not accessible.
%3NAp[x]%* is true if word %3x%* is marked "not accessible".
%3down[x]%*: If %3x%* is at location %bn%* then %3down[x]%* is location %bn+1%*.
%3conca[x;y]%* modifies %3x%* by attaching it to the front of the list %3y%*.
The value returned is the new list.
.END
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
AC∞41 ∞* AC∞41 ∞*
⊂ααααααα⊃ ⊂ααααααα⊃
~ 102 ~ ~ 102~
%ααααααα$ %ααααααα$
AC∞42 ∞*
⊂ααααααα⊃
~ 204 ~
%ααααααα$
∞1==∞3conca∞1=>∞b
. . .
⊂αααπααα⊃ ⊂αααπααα⊃
102 ~222~472~ 102 ~222~204~
%ααα∀ααα$ %ααα∀ααα$
. . .
⊂αααπααα⊃
204 ~363~122~
%ααα∀ααα$
.BEGIN CENTER;SELECT 2;
Algorithm for ∞3conca∞1
.END
.END
Can you write %3conca%* as a LISP function?
.BEGIN SELECT 3;TABIT2(18,20);TURN OFF "←";
.GROUP
initialize <= λ[[x;y]
\prog[[]
\ a\makeNA[x];
\\x ← down[x];
\\[eq[x;y] → return[T]]
\\go [a]]]
.APART;
.GROUP;
mark <= λ[[l] [\not[NAp[l]] → T;
\fwswrdp[l] → makeA[l];
\%et%* → makeA[l];mark[car[l]];mark[cdr[l]] ]]
.APART
.GROUP
sweep <= λ[[x;y]
\prog[[z]
\\z ← ();
\ a\[NAp[x] → z← conca[ x;z]];
\\x ← down[x];
\\[eq[x;y] → return [z]];
\\go[a]]]
.END
Garbage collection appears to be a very complex and time-consuming
process. Indeed it is. As indicated above, there are alternatives in some
applications. If the data-structure manipulations are particularly simple
then leaving storage management in the programmer's hands might
suffice. However, LISP generates very intertwined structures.
.P145:
Perhaps there is
another way to allow the system to handle storage management.
First notice that storage management becomes quite simple if there is no sharing
of sublists. The question of when to return a list to the free space list
is easily solved. However it is desirable to share substructure quite often.
Sharing can obviously save space and careful modification of shared structures
can communicate global information between algorithms.
Instead of using a garbage collector, we might associate a counter, called
a %2⊗→reference counter↔←%*, with each
list when it is built. In that counter we will store the number of references
to that list. Thus the counter will be initialized to 1 when the list is created.
Whenever the list
is shared we increase the counter by 1; whenever the list is no longer to
be shared by some list we decrease the counter by 1.
When the count goes to 0 we can put the cells of the list in the free space list.
One of the most glaring defects in this reference counter scheme is the
prohibition of circular lists. Consider the following sequence:
.BEGIN NOFILL;GROUP;TURN OFF"←";
%21.%* Manufacture a list,%3x%*: %3x ← (B I G M O T H E R L I S T)%1. Reference count is 1.
%22.%* Circularize it: %3x ← circle[x];%*. Reference count is now 2.
%23.%* Delete all references to %3x%*: %3x ← NIL%*. Reference count reverts to 1,
but the list is not referred to, is not on the free space list, and has thus
been lost to the system.
.END
****MUCH MORE ON REF COUNTER***
.CENT(Problems)
.P154:
I This problem deals with what is known in LISP as %2⊗→hash consing↔←%*.
We have been storing atoms uniquely, but it should be clear from the behavior
of %3cons%* that non-atomic S-exprs are %2not%* stored uniquely.
Certainly storing single copies of any S-expr would save space. For example,
the non-atomic structure of
%3((A . B).(A . B))%* could be represented with two cells rather than three.
Unique storage is not without its difficulties. What problems do you forsee
in implementing such a scheme?
II We said on {yon(P48)} that many LISP computations generate list structure
rather than true LISP-trees? Give an example.
.P55:
II Can you write a LISP function %3circle <= λ[[x] ... %* which will take
a list %3x%* and make it circular. Thus:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂ααααααααααααα←ααααααααααααα⊃
⊂αααααααααπααα⊃ ⊂αααααπααα⊃ ↓ ⊂ααααπααα⊃ ⊂αααααααπααα⊃ ↑
~ NOTHING ~ #αβα→~ CAN ~ #αβα∀→~ GO ~ #αβα→~ WRONG ~ #αβ→$
%ααααααααα∀ααα$ %ααααα∀ααα$ %αααα∀ααα$ %ααααααα∀ααα$
.END
This list is circular on its "last two" elements.
Needless to say printing such structures is not appreciated.
III What LISP operations generate such intertwined structures that a reference
counter implementation would be infeasible?
.SS(Input/Output: %3read%* and %3print%*,,P13:)
.TURN ON "←";
When you begin to implement LISP you find that a very significant
part of LISP can be written in LISP. We have already seen that the
evaluation process is expressible this way.
Here we will see that the majority of the I/O routines can be
written as LISP functions calling a very few primitive routines.
The primitive routines can also be described in `pseudo-LISP'.
That is, we will give a LISP-like description of them, though
they would normally be coded in machine language.
The primitive functions are ⊗→%3ratom%*↔← and ⊗→%3prin1%*↔←.
.BEGIN INDENT 0,10;
.GROUP
%3ratom[ ]%* is a function of no arguments. It reads the next atom
or special character (left paren, right paren or dot).
It looks up that object in the symbol table and returns
a pointer to that symbol table entry. If no entry
is found an appropriate entry is made.
%3ratom%* flushes spaces and commas,
recognizing them as delimiters. It returns only atoms or special characters
to %3read%*.
.APART
.GROUP
%3prin1[x]%* is a function of one argument expecting an atom, left paren
right paren, blank, or dot as the value of its argument.
It will print the p-name of that object on the output
device.
.APART
.END
%3ratom%* is the LISP %2⊗→scanner↔←%*. A scanner is a basic part of
an input processor. Typically
a scanner's job is to negotiate with the actual input device for input
characters. The scanner builds the most basic ingredients, like identifiers
from alphanumeric strings, or numbers from digit strings, and only after
such a basic block has been recognized is the next level of syntax
analysis attempted.
The units, also called tokens, which the scanner has build are passed up to the
%2⊗→parser↔←%*.
The typical parser builds a tree-representation of the input string;
the structure of the tree reflects the language constructs which where present in the
input.
A well-designed parser uses the BNF equations which
describe the class of well-formed input expressions to determine whether or not
the input stream is a legal expression.
%3read%* is the LISP ⊗→parser↔←.
%3read%*
will recognize both S-expression and list-notation;
besides analyzing the input stream, %3read%*
also builds an S-expression representation of the input.
To make life simpler we need to be able to refer to atoms whose print-names are
the characters "%3)%*", "%3(%*", ".", and " "(blank).
We will assume that ⊗→%3RPAR%*↔←, ⊗→%3LPAR%*↔←,
⊗→%3PERIOD%*↔←, and ⊗→%3BLANK%*↔← are such atoms. For example, if the next input
character is "(" then
←%3eq[ratom[ ];LPAR]%* is true (and the input pointer is moved to
the next character!).
%3prin1[PERIOD]%* will have the effect of printing a %2"."%* on the output
device.
We will discuss the structure of %3ratom%* and %3prin1%* in a moment. In
the meantime here is %3read%*:
.BEGIN TABIT2(17,20);TURN OFF"←";SELECT 3;
.GROUP
⊗→%3read%*↔← <=λ[[ ]prog[[j]
\j ← ratom[ ];
\[atom[j] →return[j];
\ is_lpar[j] → return[read_head[ ]];
\ is_dot[j]] → err[ ];
\ is_rpar[j] → err[ ];
\ ]]]
.APART
.BEGIN FILL;select 1;
%3read%* will return a representation of a legal S-expr or list, %9α%*.
%3err%* is a LISP system function which, in this case, will terminate the input scanning
immediately.
%3read_head%* will translate strings %9α%*, acceptable in the context "%3(%9α%1".
Thus %9α%* of %3A)%* or %3A#(B#.#C))%* would be suitable for %3read_head%*;
%3(A)%* and %3(A#(B#.#C))%* are S-exprs or lists.
%3.#A)%1 would not be acceptable since %3(.#A)%1 is not an S-expr or list.
.END
.GROUP
⊗→%3read_head%*↔← <= λ[[ ]prog[[j]
\\j ← ratom[ ];
\\[atom[j] → return[cons[j;read_tail[]]];
\\ is_lpar[j] → return[cons[read_head[ ];read_tail[ ]]];
\\ is_dot[j] → err[ ];
\\ is_rpar[j] → return[NIL];
\\]]]
.APART
.GROUP;
.BEGIN FILL;SELECT 1;
%3read_head%* is looking for legal %9α%*'s in the context "%3(%9α%1".
If it sees:
.END
.BEGIN NOFILL;SELECT 1;
\an atom, then %9α%1 is <atom>%9β%3)%1;
\a left parenthesis, then %9α%1 is %3(%9β%3)%9∂%3)%1;
\a dot, then %9α%1 is %3.%9β%3)%1;
\a right parenthesis, then %9α%1 is %3)%1;
.END
.APART
.GROUP
⊗→%3read_tail%*↔← <= λ[[ ]prog[[j]
\\j ← ratom[ ];
\\[atom[j] → return[cons[j;read_tail[]]];
\\ is_lpar[j] → return[cons[read_head[ ];read_tail[ ]]];
\\ is_dot[j] → return[read_cdr[]];
\\ is_rpar[j] → return[NIL];
\\]]]
.APART
.GROUP;
.BEGIN FILL;SELECT 1;
%3read_tail%* is looking for legal %9α%*'s in the context "%3(%1<sexpr>#%9α%1".
The structure of this function is that of %3read_head%* except for recognition
of dots. "%3.%9β%3)%1" %2is%1 plausible in the context "%3(%1<sexpr>#%9α%1".
It is up to %3read_cdr%* to see if its expectations are fulfilled.
.END
.APART
.GROUP;
%3read_cdr <= λ[[ ] prog[[j]
\\j ← read[ ];
\\[is_rpar[ratom[ ]] → return [j];
\\ %et%* → err[ ]
\\]]]
.BEGIN FILL;SELECT 1;
That is, the only input legal after a dot is a S-expr or list followed by a right
parenthesis.
.END
.APART
.CENTER;
.SELECT 3;
is_dot[x] <= eq[x;PERIOD] is_lpar[x] <= eq[x;LPAR], ...%1 etc.
.END
Here's %3print%* and friends. %3terpri%* gets a new output line.
(note: the value of %3print[x]%* is %3x%*.)
.BEGIN TABIT2(10,13);TURN OFF "←";SELECT 3;
.GROUP
⊗→%3print%*↔← <= λ[[x]prog[[ ]
\prin0[x];
\terpri[ ];
\return[x]]]
.APART
.GROUP
⊗→%3prin0%*↔← <= λ[[x]prog[[j]
\\[atom[x] → return[prin1[x]]];
\\j ← x;
\\prin1[LPAR];
\a3\prin0[car[j]];
\\[null[cdr[j]] → prin1[RPAR];return[x]];
\\prin1[BLANK];
\\[atom[cdr[j]] → prin1[PERIOD];prin1[BLANK];prin1[cdr[j]];prin1[RPAR];return[x]];
\\j ← cdr[j];
\\go[a3]
\\]]
.BEGIN FILL;SELECT 1;
Notice that we have used the extended conditional expression as described
in {yonss(P67)}. Notice too that %3prin0[(A#.(B#.#C))]%* prints as
%3(A#(B#.#C))%1.
.END
.APART
.END
So far we have thrown all the I/O back on %3ratom%* and %3prin1%*. Clearly
%3ratom%* will be more interesting. All %3prin1%* need do is get the p-name and
print it. %3ratom%* needs to do an efficient search of the symbol table and if
the atom is not found, add it to the table. All %3ratom%* has to work
with is the actual character string (called %3chr-str%* in the future) which
will be the p-name of some atom. What %3ratom%* could do is look at the
p-name of each atom currently in the symbol table; when it finds a match
it returns a pointer to the beginning of that atom.
(Notice this is essentially the search scheme of %3assoc%*.)
If the appropriate
atom is not found it can build a new one consisting of the p-name, add it
to the table, and return a pointer to this new atom. This would
have the appropriate effect; each atom would be stored uniquely, but
the efficiency would leave something to be desired.
We introduce a better searching scheme called hashing.
.CENT(Problems)
%21.%* Let %3prin_fw[x]%* be a primitive printing function where %3x%* is to
be an actual element in Full Word Space. %3prin_fw%* is to print the
character string on the output device. You may assume it knows about
%b≡%*. Write %3prin1%* as a function over %3prin_fw%*.
%22.%* You might have noticed that the definitions of %3read_head%* and
%3read_tail%*
are almost identical: the difference involves treatment of dots.
Write new versions of these functions utilizing a common routine and functional arguments.
%23.%* Write the set of BNF equations that are driving the parser, %3read%*.
.SS(Symbol table searching: Hashing,hashing,P14:)
Symbol table lookup is exactly the problem of looking up
words in a dictionary. The scheme of %3assoc%*
⊗↓called linear or lexicographical search←
is analogous to a person beginning at page one of the dictionary and
proceeding linearly (page-by-page and word-by-word) through
the book until he found the word in question. Truly a losing
scheme. What we normally do is look at the first character of
the word and go immediately to the subsection of the dictionary which
has the words beginning with that character. We know that if
we cannot find the definition of our word in that subsection we need
look no further in the book. Usually we delimit our search even
further by keying on subsequent characters in the word. Finally
we may resort to linear search to pin down the word on a specific
page or column.
What we want is a similar scheme for the machine. We might in fact
mimic the dictionary search, subdividing our table into 26 subsections.
We can do better.
Since it is the machine which will subdivide and
index into the symbol table, we can pick a scheme which is computationally
convenient for the machine. Besides being convenient, the scheme should
result in rather even distribution of atoms in the subsections.
Obviously if the majority of the atoms end up in the same partition
of the table we will have gained little towards improving the
search efficiency. Now for terminology. Each of the partitions
or subdivisions of the table is called a %2⊗→bucket↔←%*. Each atom will
appear in at most one bucket. The computational scheme or function
used to determine which bucket a particular atom belongs in is called
a %2⊗→hashing function↔←%*. All ⊗→%3ratom%*↔← has to work with is %3chr-str%*, the
encoding of the actual name of the atom. So the hashing function
will use %3chr-str%* to determine which bucket to examine. If the atom
with print-name %3chr-str%* does not appear in that bucket we are assured
that it does not appear anywhere in the table. In this case we
make up the minimal table entry --#atom#indicator, %3PNAME%*, p-name structure#--
and add it to that bucket.
There are lots of hashing functions. Here's one:
%21.%* Assume that we have N+1 buckets, numbered 0, 1, 2 ... N.
%22.%* Take the representation of %3chr-str%* (it's a number) and
divide that number by N+1.
%23.%* Look at the remainder. It's a number between 0 and N.
%24.%* Use that remainder as the index to the appropriate bucket.
This is a scheme frequently used in LISP. Given the bucket number,
we then run down the list of atoms in that bucket, comparing
print-names against %3chr-str%*. If the atom is not found, %3cons%*-up
the atom and stick it in the bucket.
There is a lot of mystery and history about hashing and other means
of searching and storing in symbols. Books by D. Gries or D. Knuth
cover searching and sorting techniques in greater detail.
In review, here is the structure of the LISP input mechanism:
%21.%* %3read%*, the LISP ⊗→parser↔←, builds a list-structure representation
of the input string. %3read%* is defined in terms of %3ratom%*.
%22.%* %3ratom%*, the LISP ⊗→scanner↔←, builds atoms and special characters
from the input. It searchs and increments the LISP symbol table.
%23.%* The LISP symbol table, usually called ⊗→%3OBLIST%*↔← (for %2⊗→object list↔←%*),
is a list of buckets. Each bucket is a list of the atoms which `hash' to that bucket.
We actually represent the object list as an array named %3oblist%*.
Arrays are discussed in full in {yonss(P137)}, but basically are an efficient
storage representation for sequences of known length. We typically allocate
a block of sequential cells and use the addressing scheme of the machine's hardware
to do a rapid subscript calculation. In the current situation
the hash number will give us the array subscript and we can
go to the right bucket immediately.
We won't have to go %3cdr%*-ing down the object list to get to the right bucket.
See the figure on {yon(P150)}.
Finally, here is a `pseudo-LISP' version of ⊗→%3ratom%*↔←:
.BEGIN TABIT2(17,20);FILL;
%3hash%*\is a function returning the bucket number of its argument.
%3insert%*\is a function to build the atom and insert it into to bucket.
%3right-one%*\is a predicate used to check if an atom
has the right print-name.
.NOFILL
.TURN OFF "←";
.SELECT 3;
⊗→ratom↔← <= λ[[ ]prog[[z;i;bucket]
\\i ← hash[chr-str];
\\bucket ← oblist[i];
\a\[null[bucket] → return[insert[chr-str]]];
\\z ← get[car[bucket];PNAME];
\\[right-one[z;chr-str] → return[car[bucket]]];
\\bucket ← cdr[bucket];
\\go[a]]]
.END
%3get%* is a LISP system function. It is described on {yon(P52)}.
.SKIP TO COLUMN 1;
.GROUP
.P150:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂αααπααα⊃
⊂ααααααα←ααααααα←αααα←βα# ~ #αβα⊃
~ εαααβαααλ ↓
~ ~ ~ ~←$
~ ⊂ααα←ααα←ααα←βα# ~ #αβ→⊃
↓ ↓ εαααβαααλ ↓
... (to bucket 2) ... ...
↓ ↓ εαααβαααλ ↓
~ ... ⊂α←αβα# ~ ≤'~←$
~ ↓ εαααβααα$
~ %ααααααα→αααααα→αααααα⊃
↓ ↓
(to bucket 1) (to bucket n)
~ ~ ~
~ %→ ... ~ ⊂αααπααα⊃ ⊂αααπαα⊃
~ ⊂αααπααα⊃ ⊂αααπαα⊃ %α→~ # ~ #αβ→ ...→~ # ~≤'~
%→~ # ~ #αβ→...→~ # ~≤'~ %αβα∀ααα$ %αβα∀αα$
%αβα∀αβα$ %αβα∀αα$ ↓ %→ ...
~ ~ ⊂απααα⊃
(to atom 1: (atom m: ~∃~ #αβ→(p-list of
bucket 1) bucket 1) %α∀ααα$ atom 1: bucket n)
~ ↓
↓ ⊂απααα⊃ ⊂αααααααπααα⊃ ⊂αααπααα⊃
... ~∃~ #αβα→~ PNAME ~ #αβα→~ # ~ #αβ→ ...
%α∀ααα$ %ααααααα∀ααα$ %αβα∀ααα$
↓
⊂αααπαα⊃
~ # ~≤'~
%αβα∀αα$
↓
⊂ααααα⊃
~CAR≡≡~
%ααααα$
.END
Note: The top level of %3OBLIST%* is stored sequentially for fast access
by the hasher; the %3cdr%*-parts are chained together in a (sequential)
list so that the structure of the table will be like any other list. The
chained representation is used by any other LISP process. In particular
the garbage collector uses this linking.
.BEGIN CENTERIT;SELECT 2;
←Partial Object List; where atom m:bucket 1 is %3CAR%*
.END
.APART
.SS(A review of the structure of the LISP machine,LISP machine)
We have a good portion of the storage conventions for LISP set out.
A difficult area involves the organization of the data structures
to perform the correct binding and unbinding of variables. Before
we tackle that, we give a diagram showing the basic structure of LISP
memory.
.GROUP;
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
.TURN ON "?" FOR "\";
.TABS 50;
⊂αααααααααααααααααααααααααααααααααααααααα⊃
~?~
~ ### THE SUBCONSCIOUS ###?~
~ ∞3eval∞* and friends?~
~ ∞3read∞* and ∞3print∞*?~
~ the garbage collector?~
~ the base registers for marking;?~
~ these include: ?~
~ AC∞41∞*, ..., AC∞4n∞*?~
~ FS pointer?~
~ FWS pointer?~
~ symbol table(∞3OBLIST∞*) pointer?~
~ registers for partial results?~
~?~
εααααααααααααααααααααααααααααααααααααααααλ
~?~
~ ### POINTER SPACE ###?~
~ the free space list?~
~ those parts of S-exprs containing?~
~ ∞3car∞*- and ∞3cdr∞*-parts.?~
~?~
εααααααααααααααααααααααααααααααααααααααααλ
~?~
~ ### FULL WORD SPACE ###?~
~ the full word space list?~
~ atom print names?~
~ numbers?~
~?~
%αααααααααααααααααααααααααααααααααααααααα$
.END
.CENT(Structure of LISP memory)
.APART
It is time to see what effect all these changes will have on the
behavior of %3eval%*.
.GROUP;
Let's take an example: let's see what happens if we want to evaluate
.BEGIN CENTER;SELECT 3;
eq[x;A].
.END
This will be presented to the machine as:
.P156:
.BEGIN CENTER; SELECT 3;
(EQ X (QUOTE A))
.END
.APART
It will be read by %3read%*, becoming internally:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂ααααπααα⊃ ⊂αααπααα⊃ ⊂αααπαα⊃
~ EQ ~ #αβα→~ X ~ #αβα→~ # ~≤'~
%αααα∀ααα$ %ααα∀ααα$ %αβα∀αα$
~ ⊂αααααααπααα⊃ ⊂αααπαα⊃
%→~ QUOTE ~ #αβα→~ A ~≤'~
%ααααααα∀ααα$ %ααα∀αα$
.END
The reference to the atoms
%3EQ, X, A,%1 and %3QUOTE%1 are actually pointers to the
atoms⊗↓Recall the picture of %3NIL%* on {yon(P141)}.←.
The list-structure will be passed to the new evaluator.
The recognizer which checks if the input is a function-call will
give a positive response to our example. Namely it will search the
property-list of the atom %3EQ%* for the indicator %3SUBR%*,
%3EXPR%*, %3FEXPR%*, or %3FSUBR%*. This will be done
using %3getl[x;(EXPR SUBR FEXPR FSUBR)]%*⊗↓See {yon(P142)} for %3getl%*.←;
it will find %3SUBR%*.
Notice that a search done by %3getl%* will typically
be short. In the previous %3eval%* we could look forward⊗↓No pun intended.←
to a search of the entire a-list; here we need only search the property list
of the atom. Similarly, the piece of the new %3eval%* which
recognizes constants will have no difficulty in determining that
%3(QUOTE#A)%* represents a constant; it will return %3A%* as value.
Where the innovation appears is in the handling of variables.
The evaluator must recognize the occurrence of %3X%* in our example
as being a representation of a variable. Again we go to the p-list.
The evaluator will search the property-list of %3X%* for the indicator
%3VALUE%* using %3getl[x;(VALUE)]%*; if a value is found it will be the
current binding of %3X%*. Again the search is quite short as compared to
that faced in the previous %3eval%*. What's the catch? The problem is
in how to maintain the correct binding of a variable in the %3VALUE%*-cell
as we enter and exit function calls. This is the subject of the next section.
.SS(Binding revisited,bind,P78:)
We have seen beginning in {yonss(P6)} that the old symbol
table mechanism has the correct
effect for the proper evaluation of recursive functions. As
we bound variables to values on entry to a λ-expression, we
%3concat%*ed the new bindings onto the front of the symbol table. This
had two results:
%21.%* In the evaluation of the body of the λ-expression we got the
correct bindings of the variables.
%22.%* The old value was saved so that we could retrieve it on leaving
the λ-body.
Now with the new table organization we need to develop the same
facility. It is not sufficient for the binding of λ-variables to
simply change the contents of the ⊗→%3VALUE%*-cell↔←. This would violate
condition %22%*. Besides changing the value, we must save the prior
value.
The pattern of binding and unbinding seems clear at least for simple
λ-bindings. If we associated a list of values with each %3VALUE%*-cell
then binding a new value could be effected by %3concat%*-ing the new
value onto the front of the list. Then %3getl%* would only be able to retrieve
that last value. As we got ready to leave the body of the λ-definition we
could restore the prior value by setting the list to %3rest%*-of-the list.
This would be a sufficient solution; it would not be an efficient solution
however.
Lists of the form which we just suggested using occur quite frequently
in computer science. They are called %2⊗→stack↔←s%* or %2⊗→push-down#lists↔←%*.
The important characteristics of stacks are that they are lists such that
only the first element is accessible; new entries are added to the front
of the list and if an element is to be removed, it must be the %2first%*
element of the list. These characteristics are indeed present in our
binding, accessing and unbinding dilemma.
What is of interest to us now is that stacks have a particularly
efficient implementation. Due to the very regular way in which stacks are
manipulated, the ⊗→linked allocation↔← implementation is not necessary.
Instead a stack can be implemented as:
.BEGIN INDENT 10,10;GROUP
%21.%* A sequence of contiguous locations
%22.%* A pointer initialized %2before%* the first of these locations.
.END
.BEGIN INDENT 10,10;GROUP
%23.%* An operation, typically called %2push%* which places a new
object in the stack. This can be accomplished by
adding 1 to the value of the stack pointer, and then
putting a representation
of the object in the cell currently referenced by the pointer.
%24.%* An operation called %2pop%* which gets the first value in the stack
and then decrements the pointer by 1.
%25.%* Though the abstract structure of a stack does not involve
limitations on the length of stack-space,
any representation should include techniques
for assuring that the stack pointer stays within its alloted space.
.END
We will resort to such a stack implementation to handle the binding
problems in the new %3eval%*. Thus we will submerge more of the evaluator
into the machine-dependent formulation: first it was the symbol tables;
now its the binding operations. However by now you should have a clear
understanding of what and why we're doing all this. That understanding
is best cultivated through the machine independent %3eval%*.
Notice that the %3concat%* operation can be interpreted as "push"-ing
and the %3rest%* operation as "pop"-ing. Indeed our earlier manipulations
of symbol tables effectively used such stack operations.
This is particularly apparent in the representation
of symbol tables given on {yon(P43)}.
.SS(%aSM%*-%3eval%*,%3eval%*,P30:)
Finally, we are ready for the new evaluator. The main innovations involve
the use of the new symbol table and the new binding mechanisms.
This version of the %3eval-apply%* pair is more representation-dependent
than the previous description, reflecting the fact that more of the
representation is embedded in these functions.
In preparation for a lower-level description of LISP's implementation
we designate the new %3eval%* as %aSM%*-%3eval%* where %aSM%* stands
for %aS%*hallow %aM%*achine⊗↓After seeing the machine you may wish to
make a different interpretation of %aSM%*.←.
The structure of the following algorithms is a compromise between abstraction and
representation; we have attempted to be abstract even at the cost of some
unacceptable inneficiencies; but since we are also attempting to show
lower-level implementation details, we have used more representation
dependencies than usual.
.BEGIN TABIT1(10);SELECT 3;
.GROUP;
eval[x] <=
\[numberp[x] → x;⊗↓%1An explicit recognizer for %3QUOTE%1-ed expressions
does %2not%* appear. %3QUOTE%* is coded as a special-form routine as we shall see.←%3
\ is_var[x] → var_val[getl[x;(VALUE)]]
\ is_fun_name[first[x]] → fun_val[x;getl[first[x];(EXPR FEXPR MACRO)]]
\ is_var[first[x]] → comput_fun_val[x;getl[first[x];(VALUE)]]
\ %et%* → apply[func[x];mapfirst[function[eval];args[x]]]
\ ]
.APART
.BEGIN FILL;SELECT 1;
This version of %3eval%* requires only one input; the symbol table is global.
The input is a representation of a form to be evaluated. If the form is a
number, return the number; if the form is a variable, call %3var_val%*
with the result of %3getl%*ing the %3VALUE%*-cell.
The atoms %3T%* and %3NIL%* which are representations of the truth-values
and are treated as variables which are permanently bound to the expected
values.
The form might also be a representation of %2f[a%41%*;#...#;a%4n%*]%1; in this
case we use %3getl%* to discover the type of %2f%*: is it an ordinary
λ-definition, a special form, or a macro? %3fun_val%* decodes this information
and performs the required evaluation.
%2f%* may also be a simple variable; in this case we call %3comput_func%*
with the current contents of the %3VALUE%*-cell.
Finally %3x%* may be a function-application where %2f%* is a complex expression
which requires further evaluation before the actual function is discovered.
These last two cases are called %2⊗→computed functions↔←%*.
.END
.GROUP
var_val[x] <=
\[null[x] → err[UNBOUND VARIABLE];
\ %et%* → second[x]
\ ]
.BEGIN FILL;SELECT 1;
%3var_val%*'s argument is the result of %3getl%*-ing the %3VALUE%*-cell.
If %3x%* is %3NIL%* then no value was found and an error message is
requested; if a non-%3NIL%* binding is given to %3x%*, then the real
value is %3second%*.
.END
.APART
.GROUP
fun_val[call;def] <=
\[is_expr[def] → apply[func[def];mapfirst[function[eval];args[call]]];
\ is_fexpr[def] → apply[func[def];list[args[call]]];
\ is_macro[def] → eval[apply[func[def];list[call]]]];
\ ]
.BEGIN FILL;SELECT 1;
This procedure handles three calling sequences: an %3EXPR%* is the
usual call-by-value case; a %3FEXPR%* is a special form; don't evaluate
the arguments, but pass them directly to the function.
The third calling sequence is called a macro call.
We will study
special forms and macros in more detail in {yonss(P18)}.
.END
.APART
.BEGIN FILL;SELECT 1;
There are actually two other indicators recognized by %3eval%*. ⊗→%3SUBR%*↔← is
the machine-language version of an %3EXPR%*; and ⊗→%3FSUBR%*↔← is the machine
version of a %3FEXPR%*. %3CAR, CDR, CONS, EQ,%* and %3ATOM%* are %3SUBR%*s;
%3COND, LAMBDA,%* and %3PROG%* have %3FSUBR%* properties.
.END
.GROUP
comput_fun_val[x;y] <=
\[null[y] → err[UNDEFINED FUNCTION];
\ %et%* → eval[mkcall[val[y];args[x]]]
\ ]
.BEGIN FILL;SELECT 1;
%3comput_fun_val%* is used in the evaluation of a computed function.
.END
.APART
.GROUP;
.TURN OFF "←";
apply[fn;args] <=
\[atom[fn] → [get[fn;EXPR] → apply[get[fn;EXPR];args];⊗↓%1Note that we are
using %3get%* as a predicate, relying on the knowledge that %ef%* maps to %3NIL.←
\ %et%* → apply[eval[fn];args]];
\ is_lambda[fn] → prog[[z]
\ bind[vars[fn];args];
\ z ← eval[body[fn]];
\ unbind[];
\ return[z]];
\ %et%* → apply[eval[fn];args] ]]
.END;
The crucial parts of the new binding scheme are expressed
in the new version of %3apply%*.
%3bind%* is a function, taking two arguments. The first is the list of
λ-variables; the second is the list of new values. %3bind%* saves the
old values and rebinds the λ-variables to the new values.
%3unbind%* is a function of no arguments used to restore a list
of λ-variables to their previous values. How is
this wonderous feat performed?
We have a ⊗→stack↔←, or ⊗→pushdown-list↔←, called SP
(for %2S%*pecial %2P%*ushdown) in which we can
save old values of variables.
%3bind%* first saves both the value
and the location of the %3VALUE%*-cell for each variable being rebound;
it then puts a special mark in the top of the SP stack to delimit
each block of λ-rebindings.
All %3unbind%* need do is restore the first block
of saved values. It knows the values and also knows the addresses of
the %3VALUE%*-cells.
All of the information it needed for restoration is saved in the stack.
This stack of previous values is also visited by the garbage collector;
it may be that the only copy of some value is accessible only through
the SP-stack. It would be most unfortunate if the garbage collector neglected
to mark that entry and the unbinding mechanism later tried to restore the value.
Finally here's a sample session with the binding mechanism:
Assume %3X%* and %3Y%* are currently bound to %3(A#.#1)%* and
%3(B#.#2)%* respectively; and assume we wish to evaluate
.GROUP
.BEGIN CENTER;
%3((LAMBDA(X#Y)(%9x%*# X Y)) (QUOTE C)(QUOTE D))%1.
We assume SP is in some well-defined state:
.END
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
part of atom for X part of atom for Y
⊂αααααααπααα⊃ ⊂αααπαα ⊂αααααααπααα⊃ ⊂αααπαα
SP ~ ~ VALUE ~ #αβ→~ # ~ #→... ~ VALUE ~ #αβ→~ # ~ #→...
⊂αααα⊃ εααααααααααααλ %ααααααα∀ααα$ %αβα∀αα %ααααααα∀ααα$ %αβα∀αα
~ #αβα→~*MARK* #ααβ→⊃ ↓ ↓
%αααα$ εααααααααααααλ ↓ ⊂αααπααα⊃ ⊂αααπααα⊃
~ last entry ~ ~ ~ A ~ 1 ~ ~ B ~ 2 ~
εααααααααααααλ ~ %ααα∀ααα$ %ααα∀ααα$
...
↓
εααααααααααααλ←$
~*MARK* #ααβ→⊃
εααααααααααααλ ↓
...
.END
.APART
.GROUP
.BEGIN CENTER
Now we save the currrent value of %3X%* and the location of its %3VALUE%*-cell:
.END
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂αααααααααα→αααααααα→αααααααααα→αααααααααα→α⊃
SP ~ ↑ ~ ~
⊂αααα⊃ εααβααπαααααλ ~
~ #ααβαα→~ # ~ #ααβ→ααα⊃ part of atom for X ↓
%αααα$ εααααα∀αααααλ ↓ ⊂αααααααπααα⊃ ⊂αααπαα
~*MARK* #ααβ→⊃ ~ ~ VALUE ~ #αβ→~ # ~ #→...
εαααααααααααλ ↓ ~ %ααααααα∀ααα$ %αβα∀αα
~last entry ~ . ↓ ↓
εαααααααααααλ . ~ ⊂αααπααα⊃
... . %αααααα→ααααααααα→αααααααα→~ A ~ 1 ~
%ααα∀ααα$
.END
.APART
A similar "push" saves the value of %3Y%*. Once this is done we are
free to bind %3X%* and %3Y%* to the new values.
.GROUP
.BEGIN CENTER
Here's what we have after rebinding %3X%* to %3C%* and %3Y%* to %3D%*:
.END
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂αααααααααα→αααααααα→αααααααααα→αααααααααα→α⊃
~ ~
SP ~ ↑ ↓
⊂αααα⊃ εααβααπαααααλ ~
~ #ααβαα→~ # ~ #ααβ→ααα⊃ part of atom for Y ↓
%αααα$ εαααααβαααααλ ~ ⊂αααααααπααα⊃ ⊂αααπαα
⊂α←ααβαα# ~ #ααβ→⊃ ~ ~ VALUE ~ #αβ→~ D ~ #→...
~ εααααα∀αααααλ ~ ~ %ααααααα∀ααα$ %ααα∀αα
↓ ~*MARK* #→ ↓ ↓
~ εαααααααααααλ ~ ~
~ ~ last entry~ ~ ~
~ εαααααααααααλ ↓ ~ ⊂αααπααα⊃
↓ ~ %ααααααααααα→ααααααααααααα→~ B ~ 2 ~
~ ⊂αααπααα⊃ ~ ⊂αααπααα⊃ %ααα∀ααα$
%α→~ C ~ #αβ→ ... %αα→ααα→~ A ~ 1 ~
%ααα∀ααα$ %ααα∀ααα$
part of atom X
.END
.APART
.GROUP
.BEGIN CENTER
Then we top off SP to acknowledge a block of saved bindings:
.END
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
SP ~ ~
⊂αααα⊃ εααααααααααααλ
~ #αβα→~*MARK* #ααβ→⊃
%αααα$ εααααααααααααλ ~
~ saved Y ~ ↓
εααααααααααααλ ~
~ saved X ~ ↓
εααααααααααααλ←$
~*MARK* #ααβ→⊃
εααααααααααααλ ↓
...
.END
.APART
Notice that %3X%* and %3Y%* have values %3C%* and %3D%* respectively
now and the previous bindings are saved on
the SP-stack. We may now begin the evaluation of the expression %3(%9x%*#X#Y)%1
assured that we will get the expected values for %3X%* and %3Y%* and
assured that we will be able to restore the previous values afterwards.
%3unbind%* simply "pops" entries off of the top of SP, using
the information stored there to restore the old values; %3unbind%*
stops as soon as it has popped the first block.
This binding scheme, called %2⊗→shallow binding↔←%* works quite well for
simple λ-binding and lookup. Changing environments is a bit of work, but
the access to the values of variables is relatively rapid.
Communication of global variables between functions
is not problematic: just look down the atom structure for the %3VALUE%*-cell.
Essentially, the %3VALUE%*-cell will %2always%* contain the current value
of its variable. That shallow-binding has some drawbacks becomes
apparent when we examine procedure-valued variables.
How can we
implement the equivalent of %3FUNARG%* in this new binding and symbol table
organization? Recall how the old %3eval%* coped ({yon(P79)}).
When we recognized an instance of a functional argument we saved a pointer to
the current environment. When we %2applied%*
the functional argument we restored the symbol table in such a way
that global variables were accessed in the saved environment while
local variables were found in the current environment.
We must try to do the same with the shallow-binding.
The action taken when a functional argument is recognized is quite similar
to our previous solution:
when %3function%* is seen save the current SP pointer, rather than
the current environment name. This action manufactures a triple
%3(FUNARG#%*<function>#<old SP>%*)%1. However the action required when we
wish to apply the functional argument is much more complicated. Before,
we just set up a new access chain such that the local table referred to
the environment saved by the %3FUNARG%* construction whenever a global
value was needed.
The main problem with the shallow-binder is that the Special Pushdown
List only reflects the %2incremental%* changes in the environments
during the computation. To retrieve the environment current when the
functional argument was bound, we must unwind SP back to the state
which was then operative; we must also save the %2current%* state
so that we may return to %2it%* when finished with the functional argument.
Now for more detail:
when we apply the functional argument,
.BEGIN centerit;
%3←(FUNARG %*<function> <old SP>%*) %1to arg%41%*; ... arg%4n%3 ,
.END
we will first rebind all of the variables found by the old SP pointer to
those old values, while saving the current values in SP. Then we can rebind the
λ-variables (i.e., local variables) of the <function> to arg%41%* through
arg%4n%*. The environment will then be properly restored for evaluation of the
function body. When the evaluation has been completed we restore using %3unbind%*.
This process is complex enough to warrant an example:
.CENT(An example of shallow binding and %3FUNARG%*)
As an example of the complications involved in handling functional arguments
in shallow binding we offer the following:
.BEGIN INDENT 0,5;GROUP
%2I%* Assume that %3x%* initially has value %31%* and the the SP pointer
is at location %bSP1%*,
%2II%* then assume that a λ-binding rebinds %3x%* to %32%*;
%2III%* in this new context, assume a functional argument, %3g%*,
is to be bound to a function-variable %3f%*.
%2IV-V%* As the computation continues %3x%* is rebound first to %33%* and
within %2that%* context rebound again to %34%*.
%2VI%* Finally %3f%* is applied; this will resurrect %3g%*⊗↓From the %3VALUE%*-cell
of %3f%* as %3(FUNARG G SP2)%*.← requiring
a restoration of the environment current at step %2III%*.
.END
.GROUP
Steps %2I%* through %2V%* would lead to the following sequence.
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
F: (FUNARG G #)
X: 1 X: 2 X: 2 ↓
εαααβαααλ εαααβαααλ ⊂ααα←ααα$
SP1 αα→~*MARK* ~ =II=> SP2 αα→~ X ~ 1 ~ =III=> ↓
εαααπαααλ αααα∀αααλ SP2 ∀α→~ X ~ 1 ~
... ~*MARK* ~ ...
εαααπαααλ
...
SP1 αα→~ ... ~
X: 3 X: 4
=IV=> SP3 αα→~ X ~ 2 ~ =V=> SP4 εαααβαααλ
εαααβαααλ %α→~ X ~ 3 ~
. . . εαααβαααλ
SP2 αα→~ X ~ 1 ~ . . .
εαααβαααλ
...
.END
.APART
Now to apply the functional argument: %3(FUNARG G SP2)%*.
This is accomplished by tracing down the SP stack with a pointer %bSP*%*,
moving from %bSP4%* --the#current#stack#pointer-- down to %bSP2%*
--the#%3FUNARG%*#pointer--, reversing all the intervening bindings on SP
and putting the saved values back into the %3VALUE%*-cell.
The pattern of these reversals must be saved; we do this by adding a new
block above %bSP4%*.
.GROUP
Thus, gasp, steps %2VII%* and %2VIII%*:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
X: 2
εαααβαααλ
X: 3 SP6 α→~ X ~ 3 ~
...
εαααβαααλ εαααβαααλ
SP5 α→~ X ~ 4 ~ SP5 α→~ X ~ 4 ~
εαααβαααλ εαααβαααλ
... ...
SP4 α→~ X ~ 3 ~ SP4 α→~ X ~ 3 ~
εαααβαααλ εαααβαααλ
... ...
=VII=> SP3 α→~ X ~ 2 ~ ←αα SP* =VII=> SP3 α→~ X ~ 2 ~
... ...
εαααβαααλ εαααβαααλ
SP2 α→~ X ~ 1 ~ SP2 α→~ X ~ 1 ~ ←αα SP*
... . . .
.END
Now we are in a position to evaluate the call on %3g%*; when we are finished
fondling %3g%* we will use the unbinding mechanism to reinstate the world
as it existed at %bSP4%*. This process will restore the %3VALUE%*-cells using
the areas of the stack between %bSP6%* and %bSP4%*.
Compare the shallow binding strategy to that of the a-list. The a-list was always
explicit when evaluation was occurring; saving environments only required saving
a pointer to the current symbol table; changing environments at most required
calling %3eval%* with a different symbol table, and when leaving a context
the old table was restored implicitly by the recursion mechanism. Accessing
a variable involved an appreciable computation; we had to search the table
for the variable-value pair.
Shallow binding obviates the symbol table search while incurring added
expense in binding and unbinding.
The care and feeding of functional arguments is more difficult since there
is only one symbol table, not the tree of tables implicit in the
previous incarnation. True, the information necessary to recover an environment
is present in SP, but it is expensive to retrieve it.
Though the shallow binding strategy %2will%* perform for functional arguments,
it will involve even more complexity if we wish to handle functional values.
The difficulty here is that a functional value's %3FUNARG%* will point
"up" the SP stack rather than "down" as is the case for functional arguments.
A straightforward application of the technique used for functional arguments
simply will not work. At the time we wished to apply the functional value
its saved SP-pointer will be pointing into a section of the SP stack
which no longer exists. For when we leave the environment which created the
functional value the current unbinding mechanism will cut the stack back to
the point which existed when we entered that environment.
For example, consider the evaluation of a form like:
.BEGIN CENTERIT;SELECT 3;
←f[a%41%*; ...;a%4n%*]%1 where:
%3←f <= λ[[x%41%*; ... x%4n%*] ...g[ ...] ...i[ ...]],
←g <= λ[[ ...] ...h[ ...] ], %1and %*
←i <= λ[[ ...] ...j[ ...] ]
.END
Typically a picture like the following occurs, where the
instance of function name means a block of special bindings necessary to
begin evaluation of that function; as the stack grows back up for evaluation
of %3i%* and %3j%*, the cells used for %3g%* and %3h%* are re-used:
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂ααα⊃ ⊂ααα⊃
~ h ~ ~ j ~
⊂ααα⊃ εαααλ ⊂ααα⊃ ⊂ααα⊃ εαααλ ...
~ g ~ ~ g ~ ~ g ~ ~ i ~ ~ i ~
⊂ααα⊃ εαααλ εαααλ εαααλ ⊂ααα⊃ εαααλ εαααλ ...
~ f ~ ~ f ~ ~ f ~ ~ f ~ ~ f ~ ~ f ~ ~ f ~
%ααα$ %ααα$ %ααα$ %ααα$ %ααα$ %ααα$ %ααα$
.END
However, if %3h%* say, generated a functional value which is to be applied
in the context of %3j%* then
we must in general retain those
values in the %3f-g-h%* stack in such a way that they may be used to restore
the enviroment when we desire to apply the functional value in the %3f-i-j%* stack.
Granted a more clever binding mechanism, the restoration
process itself is non-trivial: unwrap the bindings %3j%* and %3i%*, as we
did for functional arguments, then go up the bindings to the environment of
%3h%*.
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";
⊂ααα⊃ ⊂ααα⊃
↑ ~ h ~ ~ j ~ ~
~ εαααλ εαααλ ↓
bind ~ ~ g ~ ~ i ~ ~ unbind
↑ εααα∀αα∀αααλ ↓
~ ~ f ~ ↓
%αααααααααα$
.END
However we must save the incantations so that we may restore the state
to that of %3j%* %2after%* the functional value has been applied.
A second difficulty arises even in the simpler case of functional arguments:
since a %2copy%* of the binding environment is being generated in the unwinding
process, changes in %2that%* environment will not be reflected when we restore
the activiation environment.
The real difficulty is in attempting to
represent the tree-like structure of environments by using the essentially
linear representation of a stack. It is just insufficient. Thus we have:
.BEGIN CENTER;
Shallow binding: 0; a-list: 1; Christians: 2; lions: 4.
.END
We will present an
alternative binding strategy in {yonss(P148)} called %2⊗→deep binding↔←%*.
This alternative is a more intuitive implementation of our original a-list
version of %3eval%* than is the shallow binding scheme. We will contrast their
relative efficiencies.
.SS(Macros and special forms,macros,P18:)
Most of the discussion to this point has dealt with obtaining
a new efficient symbol table which can fulfill our requirements
of permanence and multiplicity of properties. Little has been said
about user controlled function calling, which we called the desire for
generality. We shall now deal with that aspect.
Recall our discussion of ⊗→special forms↔← in {yonss(P8}).
Special forms have been used for two purposes: to control the evaluation
of arguments (conditional expressions, quoted expressions,
%3and, or%*, etc.), and to create the effect of functions with an indefinite
number of arguments (%3list, append, plus, ...%*)
Frequently this second application of special forms can be improved upon,
particularly in the presence of a compiler.
Describing such functions as special forms is sufficient,
but not efficient, since the body of the definition must
contain explicit calls on %3eval%*. Even though we wish to define
some functions as if they had an arbitrary number of arguments, when
we %6use%* (i.e. call) the function, the number of arguments to be
applied %6is%* known. In particular, when the compiler is examining the
program, the number of arguments is known. Hopefully the compiler can be
made smart enough to compile better code than calls on %3eval%*.
That hope is well-founded as we shall see in {yonss(P57)}.
In the meantime we will discuss a language device called %2macros%*
simply as a notational convenience.
Assume, for example, we wish to define %3plus%* as a function with
an indefinite number of arguments:
.BEGIN CENTERIT;SELECT 3;GROUP;
←plus[4;5] = 9
←plus[4;5;6] = 15
←plus[4;add1[2];4] = 11
.END
That is %3plus%* is to have the properties of a function: its arguments
are to be evaluated (from left to right); but it can take an arbitrary number
of arguments.
What is needed here seems clear: define %3plus%* in terms of a %2binary%*
addition function, %3*plus%*.
.BEGIN CENTERIT;SELECT 3;GROUP;
←plus[4;5] = *plus[4;5] = 9
←plus[4;5;6] = *plus[4;*plus[5;6]] = 15
←plus[4;add1[2];4] = *plus[4;*plus[add1[2];4]] = 11
.END
That is, we %2expand%* calls on %3plus%* into a composition of calls on
%3*plus%*.
%3plus%* is being used as a %2macro%* and the expansion process in terms
of %3*plus%* is called %2macro expansion%*. Thus macro expansion
generates a composition calls on %3*plus%*.
Realize too, that since LISP programs must perform equivalently when
interpreted, we must recognize a macro construction inside %3eval%*.
How are macros written and how do they work? The body of a macro is a
λ-expression of %2one%* argument. The %2use%* of a macro looks just like
an ordinary function call, but what is bound to the λ-variable is the whole
call on the macro. When you look at the evaluation mechanism for
macros in the %aSM%*-%3eval%* you will see that the result of the macro
expansion is passed to %3eval%*. Thus the task of the macro body is to
expand the macro call and return this expansion as its value.
.BEGIN TURN ON "#";
Let's define %3<%4m%*=%1 to mean "#is#defined#to#be#the#macro#...".
Then a simple macro definition of %3plus%* might be:
.END
.BEGIN SELECT 3;TABIT2(10,25);GROUP;
.P58:
\plus <%4m%*= λ[[l]\[eq[length[l];3] → concat[*PLUS;cdr[l]]
\\ %et%* → list[*PLUS;second[l];concat[PLUS;rest[rest[l]]]]]]
.END
Thus a call %3(PLUS 3 4 5)%* would bind %3l%* to %3(PLUS 3 4 5)%* and
the evaluation of the body would result in %3(*PLUS 3 (PLUS 4 5))%*.
Evaluation of this expression would result in another call on the macro.
This time %3l%* would be bound to %3(PLUS 4 5)%*. Now
%3eq[length[l];3]%* %2is%* true and the value returned is %3(*PLUS 4 5)%*.
This can be evaluated, giving 9, and finally the outermost call on %3*PLUS%*
has all its arguments evaluated, and we get the final answer, 12.
This is a very simple (and inefficient) macro definition. Since the body
of a macro has available all of the evaluation mechanism of LISP, and
since the program structure of LISP is also the data structure, we can
perform arbitrary computations inside the expansion of the macro. Truly
LISP macros have the power to cloud men's minds.
Notice that %3SETQ%* can easily be defined as a macro
over %3SET%*:
.BEGIN CENTERIT;SELECT 3;GROUP;
←setq <%4m%*= λ[[l] list[SET;list[QUOTE;second[l]];third[l]]].
.END
.P54:
The effect of "<%4m%*=" should be clear:
it puts the body of the macro definition on the property-list of the macro
name under the indicator, %3MACRO%*. Likewise "<=" puts the function body
on the p-list under the indicator, %3EXPR%*. Similarly, we will define
"<%4f%*=" to mean "..is defined to be a special form..." and whose effect
is to put the body under the indicator, %3FEXPR%*. The body of a fexpr
definition is a λ-expression of (usually) a single λ-variable, say %3l%*. When the
special form is called we bind the list of %2unevaluated%* arguments to %3l%*.
Thus we could define %3plus%* by:
.BEGIN SELECT 3; GROUP;TABIT2(10,12);turn off "←";
.P59:
plus <%4f%*= λ[\[l] prog[[sum]
\\sum ← 0;
\a\[null[l] → return[sum]];
\\sum ← *plus[sum;eval[first[l]]];
\\l ← cdr[l];
\\go[a]]]
.END
.BEGIN SELECT 3;CENTERIT;GROUP;TABIT2(20,35);
%1Or %3and%* could be defined as:
←%3and <%4f%*= λ[[l]evand[l]]%1
where %3evand%* is defined as:
%3
\evand <= λ[[l][\null[l] → %et%*;
\\eval[first[l]] → evand[rest[l]];
\\%et%* → %ef%*]]
%*
.END
Notice that this definition of %3evand%1 differs from the one on
{yon(P53)}. The earlier definition required an explicit symbol table;
the newer one uses the global table. This is a mixed blessing. As usual
there are difficulties in getting the correct bindings of variables.
Most applications of %3FEXPR%*s involve explicit calls on %3eval%*
within the body of the %3FEXPR%*.
Consider the following definition:
.BEGIN TABIT2(20,37);SELECT 3;TURN OFF "←";GROUP;
\wrong <%4f%*= λ[[x] prog[[y]
\\y ← 2;
\\return[eval[first[x]]]]
.END;
If we execute %3wrong[0]%*, %3x%* will be bound to the list %3(0)%* and
%3eval[first[x]]%* will return %30%* as expected. But if we execute:
.BEGIN TABIT2(20,37);SELECT 3;TURN OFF "←";GROUP;
\y ← 0;
\wrong[y];
.END
then %3x%* gets bound to %3(Y)%*, and %3eval[Y]%* will find the
value associated with %3Y%* to be %32%*. Thus
the value of %3wrong[y]%* is %32%*,
rather than the expected %30%*.
Clearly, the problem is that the call on
%3eval%* takes place in the wrong environment. To alleviate this situation
%3FEXPR%*s may be defined with %2two%* arguments. In this case a %2call%*
on the %3FEXPR%* will bind the environment %2at the point of call%* to that
second argument. %3eval%*,and %3apply%* are allowed to be called with either
one or two arguments. If a second argument is present, it is used as the
environment during evaluation.
Thus:
.BEGIN TABIT2(20,39);SELECT 3;TURN OFF "←";GROUP;
\right <%4f%*= λ[[x;a] prog[[y]
\\y ← 2;
\\return[eval[first[x];a]]]
\y ← 0;
\right[y];
.END;
The call on %3right%* will use the environment with %3y%* being %30%*
rather than %32%*.
Macros can be used to perform the operations which special forms
do, but with perhaps more efficiency.
Macros can also be used with great verve in implementing abstract data structures.
The constructors, selectors, and recognizers which help characterize
a data structure can be expressed as very simple S-expr operations.
These operations are performed quite frequently in a data structure algorithm
and so any increase in their running efficiency will be beneficial.
On {yon(P73)} we will show how macros can be used to speed up
these data structure primitives.
The idea of macro processing is not recent. Some of the earliest assemblers
had extensive macro facilities. Lately macros have been used as a means
of extending so-called high level languages.
One of the most simple kinds of macros is %2textual substitution%*.
That is, when a use of a macro is detected we simply replace the call by its
body. A slightly more sophisticated application is the %2syntax macro%*.
Everytime we come across an application of a syntax macro the expander
processes it as if it had never been seen before even though much
of the expansion is repetitious. That is, syntax macros have no memory.
%2computational macros%* are an attempt to reduce some of this repetition.
In this scheme a certain amount of processing is done at the time the macro
is %2defined%*. For example a computational subtree reflecting the body of the
macro might be formed. Then whenever the macro is %2used%* we can
simply make a copy of this subtree and "glue" this subtree into the parse-tree
which we are building. This computational subtree is commonly formed by passing
the body of the macro through the compiler in a
"funny" way. The main problem with the computational macro is that there are
many desirable macros which have no such subtree, or there is other information
necessary to process the macro. There are solutions to this problem,
one of which closely parallels the abstract syntax ideas of McCarthy.
All of these styles of macros are subsets of the LISP macros.
.P157:
Many versions of LISP also have a very simple syntax macro (called ⊗→%3read%* macros↔←)
independent of
the <%4m%*= - facility. Constants appear quite frequently in LISP expressions;
and so we have already introduced abbreviations for the S-expression
representation of numbers, %et%* and %ef%*. That is we don't %3QUOTE%* them.
%3read%* macros are used to abbreviate other S-expressions. The most
common %3read%* macro is used to abbreviate the construct:
.TURN ON "←";
←%3(QUOTE %*<sexpr>%3)%*.
A special character is chosen to designate the macro, say "@".
Whenever %3read%* sees "@" the next S-expression, %9α%*, is read and
the value: %3(QUOTE %9α%*)%1 is returned by %3read%*. Thus:
←@<sexpr> is an abbreviation for %3(QUOTE <sexpr>).%1
In some systems the user may define his own %3read%* macros, in others
they are pre-defined.
.CENT(Problems)
I. Define %3list%* and %3append%* as macros. You may use only
the LISP primitives (functions, predicates, conditionals and
recursion) and a binary function, %3*append%*.
II. Give a macro definition of an extended %3SETQ%*, which is called
as %3(SETQ#%1var%41%*#exp%41%*#...#var%4n%*#exp%4n%3)%1.
Each var%4i%* is a name; each exp%4i%* is an expression to be evaluated
and assigned to var%4i%1. The assignments should go from "left-to-right".
Thus %3(SETQ X 2 Y (TIMES 2 X) X 3)%* when executed should assign
%33%* to %3X%* and %34%* to %3Y%*.
III. Define %3list%* as a special form.
.CENT(Problems)
I Evaluate the following using the new %3eval%*:
1. %3eval[((LAMBDA(L)(PRINT L))(QUOTE A))]%* assuming %3print%* is the usual printing function.
2. %3eval[(FOO(QUOTE A))]%* where %3foo <= λ[[l]print [l]].%1
3. %3eval[((CAR(QUOTE (FOO)))(QUOTE A))]%*, with %3foo%* as above.
4. %3eval[(FOO1(QUOTE A))]%* where %3foo1 <%4f%*= λ[[l]print[l]]%1.
and finally:
5. %3eval[((CAR (QUOTE (FOO1)))(QUOTE A))].%1
Explain what happened. Can you "fix" it?
II Some implementations of LISP distinguish between %3FEXPR%*s and %3EXPR%*s
by modifying the prefix, %3LAMBDA%*. %3FEXPR%*s are stored with a prefix
%3NLAMBDA%*; %3EXPR%*s are stored with the normal %3LAMBDA%* prefix.
What are the advantages and drawbacks of this scheme?
III Recall the extensions to LISP-binding proposed in {yonss(P67)}. Discuss
their implementation in the light of the new %3eval%* and symbol table.
.SS(An example of %aSM%3-eval%1)
Now that we've seen the basic structure of the new evaluator, seen the
new calling sequences, and seen the new binding strategy, it might
be helpful to put it all together in an example.
We shall embellish the example on {yon(P156)} slightly.
Assume that we want to evaluate %3f[A]%* where %3f#<=#λ[[x]eq[x;A]]%1.
It is clear that the value should be %et%*. Let's see
how it would be done. First %3f%* and its definition would be encoded
as S-expressions; and since %3f%* is a call-by-value definition we
could introduce the definition to the symbol table by⊗↓In actual practice
we have much more civilized ways of defining functions.←:
.BEGIN CENTER;SELECT 3;
putprop[F;(LAMBDA(X)(EQ X (QUOTE A)));EXPR]⊗↓%1Actually since %2this%*
expression must also be presented to the machine %2it%* must be
coded as an S-expr. Using the read-macro facility of {yon(P157)} we could write
it as:
%3(PUTPROP#@F#@(LAMBDA(X)(EQ(X#(QUOTE#A)))#@EXPR)%1.
We might also think about writing an %3FEXPR%* version like %3putprop%* but which
did not require %3QUOTE%*-ing.←
.END
Now we could address %3eval%* with the question:
.BEGIN CENTER;
.SELECT 3;
(F (QUOTE A))
.END
After being mauled by %3ratom%* and %3read%*, the new %3eval%* would see
the expression. It would decide that the form of the expression was
a representation of function-followed-by-arguments. Since this evaluator
recognizes calling sequences other than call-by-value, it does %2not%*
automatically evaluate the argument; rather, it looks at the property list
for %3F%*. There it finds %3EXPR%*; it is call by value. We evaluate the
argument list. In the process we come across %3(QUOTE A)%*. Again the format
of the expression is function-followed-by-arguments; again we go to the
p-list. However %2this%* time we find a %3FSUBR%* indicator on %3QUOTE%*.
That says don't evaluate, but pass the argument list to the machine-language
code for %3QUOTE%*⊗↓You
should think about the code for %3QUOTE%*;
you should finally convince yourself that the routine for
%3QUOTE%* is the same as the routine for %3car%*←.
The code for %3QUOTE%* sees %3(A)%*; the code for %3QUOTE%*
returns %3A%*.
Now that the arguments for the call on %3F%* are evaluated, we dredge
up the definition of %3F%*. Since it is a λ-definition, we exercise the
special pushdown mechanism, saving the current binding of %3X%*⊗↓If an
atom has no %3VALUE%*-cell at this time we manufacture a default entry of
%3UNBOUND%*.← while rebinding %3X%* to %3A%*.
We begin the evaluation of the body of %3F%*. That entails a call on
%3EQ%*. As we know, that will find a %3SUBR%* property for %3EQ%*; we
evaluate the arguments of %3EQ%*. %3eval%* will recognize the occurrence
of %3X%* as that of a variable; it will look for the %3VALUE%* property,
it will find it and will return %3A%* as value. The evaluation of the second
argument we've seen before. It will still get %3A%*. Now that both
arguments to %3EQ%* are completed, we call the primitive routine for
%3EQ%*. %3EQ%* sees that both arguments are atomic and are indeed the same
atom; it returns the representation for %et%*. Finally we go through the
unbinding process, restoring %3X%* to its prior value.
.CENT(Problem)
%21.%* Write a %3FEXPR%* function called %3de%* which takes three
"arguments": the first is the name of a function the second is a
list of local variables and the third is a function body.
Thus %3(DE#F#(X)#(EQ#X#(QUOTE A)))%* would have sufficed in the current example.
.SS(Progs in %aSM%3-eval%1)
The implementation of %3prog%*s and their associated constructs is a study
in binding. A vertiable orgy of binding takes place:
.BEGIN INDENT 0,10;
.GROUP;
%21.%* On entrance to a %3prog%*: Since %3return%* and %3go%* perform their
duty in the dynamically enclosing %3prog%*, we must bind on some indication
that we have entered a %3prog%*.
.APART
.GROUP
%22.%* The %3prog%* variables: We λ-bind all %3prog%*-variables to %3NIL%*
as we proceed into the %3prog%*.
.APART
.GROUP
%23.%* The local labels: Typical implementations manufacture a special
internal list of bindings, identifying each label with its corresponding
piece of code. These bindings are separate from any other variable bindings
since we allow label names to clash with %3prog%*-variables or λ-variables.
Typically these extra bindings are called the %2⊗→%3go%*-list↔←%1.
The %3go%*-list is used by the %3go%*-construct to achieve faster execution;
instead of searching the S-expr representation of the %3prog%* every time
a %3go%* is to be executed, the value of the appropriate %3go%*-list entry
points to the code to be executed.
.APART
.GROUP
%24.%* Assignment statements: The binding done by an assignment statement
is of a different character than that performed by %3prog%*-variable, λ-variable,
or %3go%*-list bindings. These bindings "cover up" any previous bindings
such that after we leave the current context, that old binding can be restored.
An assignment is a %2destructive%* binding; it changes the current binding.
Obviously if the variable being assigned to is a local variety, then a previous
binding will be restored as we leave the %3prog%*; however if the variable
is %2global%* one then that global value will be changed. Such changes
are called %2⊗→side-effect↔←s%*.
.apart
Now for details:
.END
We should have a special form, call it %3evprog%*, to handle
interpretation of %3prog%*s. Assume %3evprog%* binds up the %3prog%*-variables
and manufactures a %3go%*-list as advertised above. Next, %3evprog%*
begins the sequencing implicit in the body.
.BEGIN GROUP;
This is essentially done by an algorithm like:
.BEGIN SELECT 3;CENTERIT;
←evprog_body <= λ[[l][null[l] → (); %et%* → prog2[eval_st[first[l]];evprog_body[rest[l]]]]]
%1where:
←⊗→%3prog2↔← <= λ[[x;y]y]⊗↓%1Due to LISP's call-by-value scheme, the effect of
%3prog2%* is to evaluate both arguments and return the second value.%3←
%1and:
%3←eval_st <= λ[[s][is_label[s] → (); %et%* → eval[s]]]
.END
.END
We have already discussed most of the evaluation process for the consitutients
of %3prog%*s. We should note that %3go%* will evaluate its argument until it
finds a label; it will then jump to that label, restoring the environment
surrounding that label. In an %3eval%* we would find something like:
.BEGIN CENTERIT;SELECT 3;GROUP;
←go[cdr[locate[eval_go[form];go_list]]]%1, where:
←%3eval_go <= λ[[exp] [is_label[exp]→ exp; %et%* → eval_go[eval[exp]]]].
.END
Assignments are handled in a straightforward manner:
put the appropriate value on the property list of the appropriate atom.
.BEGIN CENTER;SELECT 3;
is_setq[form] → putprop[name[form];eval[arg[form]];VALUE]
.END
would suffice.
.SS(An alternative: deep bindings,deep binding,P148:)
There are some weaknesses in the shallow binding scheme espoused
in the previous sections. First, as we have seen ⊗→shallow binding↔← requires
that we be a bit careful in handling functional arguments.
Also there is an appreciable overhead in changing contexts.
Second, our strategy of requiring argument passing in a fixed number of
registers, AC%41%* through AC%4n%*, is unnecessarily restrictive. There
will be some arbitrary limit on the number of arguments which a function
may have. After a moment's reflection on the representation of the symbol table
as a stack {yon(P43)}, it should be apparent that we might try passing the
arguments to a function requiring %3n%* arguments as the first %3n%* elements
of a stack. The value returned by the function could be defined to be the
value located in the top of the stack. This holds promise. Consider the
evaluation of a form
.BEGIN CENTERIT;SELECT 3;
←f[g%41%*[ ... ]; ... g%4n%*[ ... ]] .
.END
The evaluation of the arguments is to proceed from left to right. If we perform
the procedure "calculate the argument; leave value in top of stack", then %3f%*
could expect to find values for its λ-variables as:
.BEGIN TABIT3(20,40,60);GROUP;
\value stack:\| value for %3x%4n%1\|
\\| value for %3x%4n-1%1\|
\\| ... ... \|
\\| value for %3x%41%1\|
.END
There will be no problem about searching to find the values; %3f%* "knows"
where to find the values in the stack. When %3f%* is finished its computation
it need only remove the top n elements from the value stack and add the value
which it computed.
This scheme seems to have the advantage of shallow binding: fast access
to values, but none of the disadvantages. It looks like the a-list scheme
for binding and unbinding. What's the problem? It's global variables.
If %3f%* wants to access a global variable how can it be found? All
we have stored on the stack is the %2value%* and there is no way that %3f%*
can "know" %2which%* value is the value of the global variable.
Surely we can solve this problem: simply store %2pairs%*##---##name-value pairs.
So what's the problem now? The expensive calculation only arises when we
access global variables. Most variables %2are%* local aren't they ---
or are they? Function names are variables and are almost always global;
%3car%* is a variable name, whose "value" is a primitive routine to
compute the %3car%* function. Seldom do you wish to redefine %3car%*.
Searching the stack for name-value pairs %2is%* more expensive than
picking up the value cell of the shallow binding scheme. We will soon see
that LISP compilers for deep binding %2can%* be made efficient
in their access of local and global variables; but the interpreter
%2will%* have to search. One of the worst cases will be the execution of a loop,
in which accesses to the same variables occur frequently.
This, perhaps is another good reason for removing control of iteration
from the hands of the programmer. The extent of (or even the presence of) a loop
which the user is
controlling by tests and goto's is difficult to discover. If a loop
is controlled by language constructs (WHILE, REPEAT, etc.) then the
interpreter might have some chance of improving the execution of the loop.
As we have previously said, deep binding is very reminiscent of the a-list
symbol table structure which was first a stack, then embellished to become
a tree when functional arguments appeared. But recall that the a-list scheme
had each name-value pair chained to its successor. Pointers into the table
structure (environment pointers) entered into this chain at various places
to define an environment. With the a-list scheme, we were able to generate a
tree structure, but deep binding, so far, still has a stack-like structure.
Something must be done. Let's look at the a-list %3eval%* again.
First note that the environment pointers we generated for handling
functional arguments point into the symbol tables in very specific places;
they %2always%* point into the tables at the beginning of λ-bindings. They will
%2never%* point into the interior of a block of bindings. Thus each
%2block%* of λ-bindings can go onto a stack in a contiguous fashion.
Now recall the Weizenbaum environments: each environment has a local
symbol table (λ-variables, and %3prog%*-variables); each environment
also has entries for E%4a%* and E%4c%*. So we may simulate a tree on the
name-value stack if we include in each block of bindings, pointers
representing E%4c%* and E%4a%*. Thus the search strategy becomes yet a bit
more complex: when seeking a value for a variable run down the name-value
stack comparing names; when the block is exhausted we follow the E%4a%*-pointer.
To exit a block is easy and in fact much easier than the shallow binding
ritual: simply restore the E%4c%*-pointer.
The difficulties in handling functional values are more severe. The solution
advanced by the a-list implementation is overly expensive. There we kept the
symbol tables available by representing them as list-structure and therefore
they became a garbage collectable commodity. As long as a %3FUNARG%* construct
accessed a table then the table was retained. Essentially we had removed
the stack-like behavior of symbol-table accesses which occurs most of the
time and replaced it with a general scheme which
works for all cases but incurs a significant overhead in even the most
simple of function calls.
We would like an intermediate solution:
one which works for all cases and minimizes overhead in the general
run-of-the-mill call. Such a scheme can indeed be implemented.
Recall our discussion of garbage collection in {yonss(P146)}. There we said
that a garbage collector was used in LISP since the interrelationships
which we generated in the data structure manipulations were
sufficiently intertwined that it was not possible to use less
sophisticated method to determine whether a structure was still active.
Now local symbol tables %2are%* data structures; the discussion of
Weizenbaum environments in {yonss(P77)} should have convinced you of that.
They are chained
together in a manner reminiscent of that of our implementation
of S-exprs; indeed as we have just mentioned LISP's attitude toward
management of such tables was to garbage collect them. However the %2behavior%* of
tables during the execution of a program is much less complex than that
requiring a garbage collection. Again as we have just seen the behavior
is quite rational and predictable except for procedure-valued variables.
A solution giving a reasonable implementation is based on
the alternative storage management scheme of ⊗→reference counter↔←s which we
described on {yon(P145)} is described in a paper by D. Bobrow and B. Wegbreit.
The effect of their representation of the control for LISP is that
little overhead is extracted if the program does not use the more
exotic features: a stack-like device results. A larger toll is
paid for use of more general control regimes.
A final implementation note: it %2is%* possible to think about combining the
shallow and deep strategies, getting a "semi-shallow" ⊗↓way down deep it's shallow←
scheme. We have both name-value stacks %2and%* p-lists. We will try to use
the %3VALUE%*-cell as much as possible.
We associate an extra piece of information with each value attached to
any %3VALUE%*-cell, telling the binding time of that variable.
We have a global indicator telling the current context which we are using:
E%4current%*, say. When we want the value of a variable, we first go to the
augmented %3VALUE%*-cell; if the binding time indication is that of
E%4current%1, then the value is correct and we take it. If the
indicators disagree, we go to E%4current%* and start the name-value search;
when we find the variable we update the %3VALUE%*-cell and change the binding
time indicator to E%4current%*. This way we only search the stacks for the
first access to a variable; after that we can justifiably use the shallow binding.
The important point to keep in mind is that disaster will result if
implementation of language constructs is attempted before thorough understanding
is gained. Attempting to implement a language like LISP by using a
stack-like device for maintaining all variable bindings, either current or or
past, simply will not work.
.SS(Epilogue)
←%2Subtitled: But my Fortran program doesn't do all this crap!%1
It is quite true that a running Fortran, PL/1, or Algol program
is far less complicated as far as its symbol accessing mechanisms
are concerned. In Fortran there is a simple relationship between
variables and memory locations which will contain their values;
in Algol, there is a simple relationship between variables and positions
in the run-time stack.
In LISP, both the quality and the quantity of variables can change.
Arbitrary properties can be associated with atoms at run-time.
Indeed, the symbol table mechanism of LISP is more reminiscent of
that associated with the Fortran or Algol compiler. For these less
sophisticated languages it is the compiler which performs the mapping
from source language to running machine code. It is the compiler's
responsibility to discover the properties associated with each variable.
The compiler can do this because the semantics of the language is such
that at compile all (or almost all) properties of the variables are known.
This is not true for LISP. In general you cannot tell until run time what
the attributes of a particular atom are. The situation is really even worse
than this. Since programs and data are indistinguishable, we can %3cons%*-up a
list, assign it to a variable, then turn around and use that variable as
a function.
Now the world isn't all this grim. There are LISP compilers (written
in LISP naturally). They can make many decisions at compile time
about the properties of variables, but in general the compiled code
will be interspersed with calls on %3eval%*. That is, %3eval%* will have to make
some decisions at runtime, because the compiler just doesn't know what to do.
This implies that compiled and interpreted code must be able to
communicate with each other. If a piece of compiled code calls a
λ-expression or conversely, the right thing must happen. The execution
of the program should be totally transparent as to whether any, or all, or
none of the functions are compiled. This means that the calling
sequences for both kinds of functions must be compatible. Less
obvious and by far more troublesome, is the communication of global
values between functions. The next chapter of the text discusses
the run-time behavior required for implementations of LISP-like languages
and will cover a complete discussion of LISP compilers.
.END "JRA";